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ABSTRACT 

This  Semiannual  Technical  Summary  covers  the 
period  1  April  1982  through  30  September  1982. 
It  describes  the  significant  results  of  the 
Lincoln  Laboratory  Multi-Dimensional  Signal 
Processing  Research  Program  sponsored  by  the  Rome 
Air  Development  Center,  in  the  areas  of  target 
detection,  adaptive  contrast  enhancement,  and 
image  processing  architectures. 
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NTRODUCTION  AND  SUMMARY 


The  Lincoln  Laboratory  Mult i-Dimensionai  Signal  Processing  Research 
Program  was  initiated  in  FY80  as  a  research  effort  directed  toward  the 
development  and  understanding  of  the  theory  of  digital  processing  of  multi¬ 
dimensional  signals  and  its  application  to  real-time  image  processing  and 
analysis.  'A  specific  long-range  application  i3  the  automated  processing  of 
aerial  reconnaissance  imagery.  Current  research  projects  that  support  this 
long-range  goal  are  image  modeling  for  segmentation,  classification  and 
target  detection;  techniques  for  adaptive  contrast  enhancement;  and  multi¬ 
processor  architecture  to  implement  image  processing  algorithms. 

This  Semiannual  Technical  Summary  discusses  results  in  three  areas.  In 
Section  2  we  present  a  detailed,  technical  description  of  our  efforts  in 
target  detection.  The  problem  is  formulated  as  a  significance  test  to 
determine  whether  a  small  region  contains  one  or  more  pixels  (picture 
elements)  that  do  not  match  the  measured  statistical  properties  of  the 
background.  Examples  are  shown  in  which  the  resulting  detection  algorithms 
are  applied  to  aerial  reconnaissance  photographs. 

Section  3  contains  program  documentation  information  for  the  adaptive 
contrast  enhancement  software  delivered  to  RADC/IRRB  earlier  thia  year.  It 
includea  a  deacription  of  all  the  program  modes,  parameters,  and  user 
interaction  as  well  as  the  typescript  from  a  sample  run. 

In  Section  4  we  discuaa  our  latest  thoughts  on  a  multi-processor 
architecture  for  image  proceaaing  applications.  Attention  is  focused  on  the 
architectural  requirements  for  a  tingle  nodal  processor.  (Aa  currently 
planned,  the  mult i-proceaaor  will  consiat  of  16  such  nodal  processors.)  The 
architectural  study  is  by  no  means  complete,  but  we  have  investigated  several 
architectural  principles  that  appear  to  support  the  goals  of  high 
computational  throughput  and  ease  of  programming.  In  FY83,  we  plan  to 
continue  in  thia  vein,  refining  the  architecture  and  developing  an 
instruction  set  for  the  nodal  processors. 
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2.  TARGET  DETECTION  BY  TWO-DIMENSIONAL  LINEAR  PREDICTION 


2.1  Introduction 

This  research  relates  to  the  problem  of  detecting  targets  (i.e., 
anoraolous  areas)  in  aerial  photographs.  We  define  the  target  detection 
problem  as  the  detection  of  man-made  objects  in  a  textured  background  (e.g., 
trees,  grass,  fields). 

In  a  previous  Semiannual  Technical  Summary  (1],  we  developed  a  target 
detection  algorithm  based  on  the  coefficient  change  function  (CCF).  We 
looked  for  changes  in  the  parameters  of  a  continuously  varying  autoregressive 
model  of  aerial  photographs.  We  hypothesized,  however,  that  the  two- 
dimensional  (2-D)  prediction  error  may  provide  a  significant  improvement  over 
this  CCF  algorithm. 

In  this  section,  we  develop  such  an  algorithm  and  provide  a  rigorous 
theoretical  basis  for  it  in  terms  of  significance  testing  [2).  Our  detection 
algorithm  is  derived  from  the  fact  that  significance  testing  can  be  expressed 
in  terms  of  the  error  residuals  of  2-D  linear  prediction.  We  first  develop 
the  algorithm  under  a  stationary  Gaussian  assumption  and  then  proceed  to 
generalize  it  for  the  case  where  the  background  ia  nonstationary.  This  more 
general  teat  involves  determining  the  error  residuals  from  an  optimal  apace- 
varying  predictor.  The  error  residuals  are  combined  over  a  small  area, 
suitably  normalized  and,  finally,  compared  to  a  threshold.  Since  a  causal 
2-D  prediction  filter  ia  associated  with  our  significance  test,  we  can 
interpret  this  as  modeling  the  background  by  a  2-D  causal  space-varying 
autoregressive  random  process.  The  parameters  of  this  autoregressive  model, 
therefore,  need  to  be  estimated  from  the  background. 

Our  detection  algorithm  (as  we  shall  demonstrate  with  examples)  haa  been 
applied  successfully  to  both  synthetic  images  and  actual  reconnaissance 
photographs  obtained  fioo  the  Rome  Air  Force  Development  Center  (RADC). 
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2.2  Significance  Testing 


The  problem  of  target  detection  in  images  is  finding  small  areas  in  an 
image  whose  statistical  properties  do  not  match  those  of  the  surrounding  area 
or  background.  Since  the  target  statistics  are  generally  unknown  (it  is 
desired  to  detect  broad  classes  of  targets),  and  the  background  statistics 
may  be  known  or  can  be  estimated,  the  problem  is  inherently  different  from  a 
classic  detection  or  classification  problem.  Essentially,  the  only  question 
that  can  be  asked  is  "Does  a  set  of  pixels  under  examination  represent 
background  or  does  it  represent  something  else?"  The  area  of  statistics  that 
addresses  such  questions  is  called  significance  testing  (2).  The  basic  idea 
is  illustrated  in  Figure  2-1.  A  measurement  is  made  of  some  random 
phenomenon  characterized  by  probability  density  p(x).  Critical  regions 
(i.e.,  regions  of  low  probability)  are  chosen  corresponding  to  unlikely 
events.  If  measurements  fall  in  the  critical  region,  we  reject  the 
hypothesis  that  the  measurements  really  belong  to  the  density  p(x). 


Fig.  2-1.  Illustration  of  significance  testing. 
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In  our  case,  the  probability  density  is  that  of  the  background.  If  a 
set  of  measurements  fails  in  the  critical  region,  we  reject  the  hypothesis 
that  the  pixels  form  a  part  of  the  background;  that  is,  we  decide  that  they 
represent  (at  least  partly)  a  target.  The  significance  level  of  the  test  is 
determined  by  the  probability  of  events  in  the  critical  region.  For  example, 
if  this  probability  is  .05,  then  the  significance  test  is  at  the  51  level. 

The  reasoning  involved  in  the  significance  test  may  seem  indirect. 
However,  since  only  the  background  has  known  or  estimable  statistics,  this  is 
a  logical  line  of  reasoning  to  take. 

2.2.1  Formulation 


Given  an  image  and  small  region  S  at  (n,m),  let  the  image  points  in  S 
(see  Figure  2-2)  be  denoted  by 


{vv* 


•V 


(1) 


We  want  to  decide  whether  the  points  in  S  corresponds  to  a  homogeneous  random 

field  with  probability  density  p(at)  (i.e.,  S  contains  just  background)  or 

whether  S  contains  something  other  thsn  the  homogeneous  random  field  (object 

possibly  present).  We  want  to  do  this  for  all  (n,ra). 

Let  the  btckground  correspond  to  a  stationary  Gaussian  random  process 

T 

with  mean  m  *  E(x)  and  covariance  R*E{x-m) (x-m)  1.  Then 


p(x) 


1  T  -1 

exp  (-  (x-tn)  K  (*-») ) 


(2) 


We  shall  plot  the  probability  density  function  and  determine  a  critical 
region  of  small  probability  (ci)  which  is  the  level  of  significance  (see 
Figure  2-3).  Let  be  the  hypothesis  that  the  points  in  S  correspond  to  the 
homogeneous  random  field.  Then  we  accept  (decide  "background  only")  if 
the  points  in  S  do  not  fall  in  the  critical  region.  Otherwise,  we  reject 
(decide  "more  than  just  background").  Thi«.  is  repeated  for  every  (n,m). 
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A  critical  region  C  can  be  defined  by  p(iO<A.  The  relation  between  A 
and  the  level  of  significance  a  is 


/p(x)djt  »  a 

pi  x)  <A 


(3) 


Taking  the  logarithm  of  p(x)<A,  our  significance  test  becomes: 

(x-m) V1  (x-m)  >  -in(2a)NjK|-2inA  (4) 

Suppose  that  we  wish  to  maintain  a  constant  level  of  significance 
regardless  of  the  background  statistics  (i.e.,  regardless  of  the  covariance 
K) .  Tina  is  equivalent  to  enforcing  a  constant  false  alarm  rate.  To  do  this 
we  will  find  the  particular  form  that  A  must  take  as  we  vary  K.  We 
investigate  the  1-D  case',  the  N-D  caae  is  more  complicated  and  is  given  in 
the  appendix. 

Let  the  density  function  of  the  teto-tnean  random  variable  x  take  the 

form 


p(x)  -  — L_  eXp  (-**720^)  (5) 

SI*  o 

Then  the  boundaries,  fx^  ,  of  our  critical  region  are  given  by  p(iXg)  *  A. 
From  Equation  5  it  follows  that 

x  -  (-2o2ln(/3ToA>)l/2  (6) 


The  level  of  significance  in  thia  case  is  given  by 

2  /*  — 1 —  exp  {-x2/2o2]dx  •  o  (7) 

*o  /2i-o 
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Now,  let  u-x/o ,  so  that 


2  f  /  — —  exP  t“u^/2]du  *  a  (8) 

V0  m 

For  a  constant  a,  we  must  have  (from  Equations  6  and  8): 

1/2 

x  /o  ■  (-2JZ.n(/I?  oX))  m  C  (9) 

o 

where  C  is  a  constant.  Therefore,  X  must  vary  as 

X  -  (10) 

/TT  o 

That  is,  X  mast  vary  inversely  as  the  standard  deviation  to  guarantee  a  level 

of  significance  that  doea  not  vary  with  the  background  ststiatics. 

2 

Evaluating  Equation  4  for  the  1-0  cage  where  K*|K|“0  , 

(x-ta )2/o2  »  -i.n(2*o^)-2*nX 

>  -  ln(2«a2C2/2»o2) 

>  -  loC2  -  C’  (11) 

where  C'  ia  a  constant  threshold,  Thus,  in  the  1-D  case,  our  significance 
tost  involves  comparing  the  squared  random  variable  (with  mean  removed  and 
appropriately  normalised)  to  a  constant  threshold, 

Kore  generally,  we  can  show  that  for  a  constant  level  of  significance, 
in  the  N-D  case  (see  appendix),  X  oust  vary  da 

X  -  C/(2*)ii/2  |Kj  1/2  (12) 

Thus,  Equation  4  becomes: 

U-o)V1(x-b)>C'  (13) 
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2.2.2  Relationship  of  Significance  Testing  with  Linear  Prediction 


We  will  now  show  that  the  significance  test  of  the  previous  section  can 
be  expressed  in  terras  of  the  error  residuals  by  optimally  predicting  each 
sample  of  our  small  region  S  by  a  linear  combination  of  its  "past"  samples 
("past"  is  defined  below). 

Our  connection  can  be  made  because  the  background  covariance  matrix  K 
can  be  uniquely  factored  in  terras  of  upper  and  lower  triangular  and  a 
diagonal  matrix  [3] .  Let 

T 

K  -  LDL  (14) 


where  L  is  lower  triangular  with  ones  along  its  diagonal,  where  D  is  a 
diagonal  matrix,  and  where  T  denotes  transpose.  Substituting  Equation  14 
into  13,  we  have 

(sc— m ) TK  1  (x-ra)  *  (x~m)T(LDLT)  *  (x-m) 

T 

*  (x-ra)TL  *  D  ^L  ^(x-m) 

T  -1 

»  e  D  e  >  C’  (15a) 


where 


“  L_i  (x-m)  (15b) 

It  is  straightforward  to  show  that  since  L  is  lower  triangular  with  a 

unit  diagonal,  L  has  the  same  property,  and  thus  Equation  15fc  represents  a 

causal  transformation  of  the  vector  jt  [3],  Furthermore,  it  can  be  shown 

[3]  that  each  row  of  L  represents  the  coefficients  required  in  optimally 

predicting  an  element  of  x^  from  its  previous  values,  i.e.,  from  a  linear 

combination  of  x  . . . .x, .  Since  the  covariance  matrix  K  is  that  of  the 
n-i  i 

background,  the  coefficients  of  L  correspond  to  optimal  prediction  of  the 
background  and  not  target  areas.  This  prediction  concept  is  illustrated  in 
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Figure  2-4,  where  we  are  predicting  the  middle  pixel  of  a  3  x  5  region  S  from 
12  of  its  neighboring  values.  An  element  e^of  —  *n  Equation  15b  is  given  by 
k 


(16) 


where  a,  is  the  vector  of  coefficients  for  optimally  predicting  x  from  the 
values  x^,...,x^_^.  The  diagonal  elements  of  D  are  given  by  o  var  (e^). 
Note  that  the  e^'s  are  uncorrelated,  i.e.,  they  form  a  white  process  (3]  when 
the  pixels  being  predicted  (and  doing  the  prediction)  are  background  pixels. 


123368  N 


•  •  •  •  • 

•  •  •  •  • 

Fig.  2-4.  Prediction  of  from  its  previous  values 
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As  illustrated  in  Figure  2-4,  the  mask  required  in  predicting  each 
element  of  S  is  a  growing  nonsymmetric  half-plane  mask  [4]  .  Suppose  that 
the  background  happens  to  be  governed  by  such  a  pixel  relationship,  i.e.,  we 
shall  assume  that  the  background  random  field  follows  an  autoregressive 
process  of  the  form: 

x(n,m)  *  £  £  a( j ,k)x(n-j ,nr-k)  +  w(n,m)  (17) 

(j,k»(0,0) 

where  w(n,m)  is  white  Gaussian  noise  and  where  (k^,i^)  <  denotes  that 

(kpi^)  is  in  the  "past"  of  For  &ny  point  (s,t),  we  define  the  past 

to  be  the  set  of  points  (5]: 

{(k,£)|k*s,  i<t;  k<a,  -J0<  £<»}  (18) 

With  a  90*  rotation  of  coordinates,  the  values  of  S  used  in  predicting  x^ 
can  then  be  those  elements  in  S  that  are  in  the  now  formally  defined  past  of 
.  For  our  purposes,  we  shall  refer  to  the  model  of  Equation  17  as  a  causal 
model,  i.e.,  each  x(n,m)  is  a  function  of  its  past.  Note  that  this  notion  of 
causality  is  tied  to  the  shape  of  the  noasymmat ric  prediction  mask  of  Fig¬ 
ure  2-4. 

Let  us  suppose  that  the  background  follows  an  autoregressive  model  of 
finite  order.  For  example,  let  the  background  follow  a  first-quadrant  causal 
aucoregressive  model.  Then,  except  for  certain  boundary  elements,  each 
element  of  in  Equation  16  consists  of  the  error  in  predicting  each  pixel 
of  S  from  its  first  quadrant  neighbors  (in  a  90*  rotated  coordinate  system). 
This  is  illustrated  in  Figure  2-5  where,  except  for  the  L-shaped  boundary, 
the  fixed-order  prediction  error  equals  e^ .  That  is,  the  remaining 
coefficients  of  the  growing  nonsymmetric  half-plane  mask  of  Figure  2-4  are 
jero  and  thus  do  not  affect  the  prediction.  Generally,  the  background  random 
field  can  be  modeled  by  a  causal  autoregressive  process  of  sufficiently  large 
order  [4],  and  this  model  corresponds  to  the  optimal  linear  prediction  from 
past  values  of  the  2-D  random  field. 
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Fig.  2-5.  Representation  of  f irat-quadrant  fixed-order  predictor. 


Returning  to  our  significance  test,  from  Equation  15a, 

T  -1  ^  \ 

J3  £  “  1  —j  >  constant  (19) 

U-l  o 

x 

2  . 

where  o.^  is  the  prediction  error  variance  associated  with  predicting  a 

background  value  x,  . 

2*2. 

Although  and  arise  from  a  growing  predictor  using  only  the  samples 

within  region  S,  they  can  be  approximated  by  tne  residuals  and  variances 

associated  with  a  fixed-order  iinear  pred.ctor  where  samples  both  inside  and 

outside  S  are  used  in  the  prediction.  In  this  forir  lation,  since  the 

2 

background  is  assumed  stationary,  o‘  in  F.ouation  19  is  constant  over  the 
summation  index.  Thus,  an  approximate  significance  test  can  be  written  as 


12 


(20) 


1  r  ~  2 

— j-  i  >  constant 

<J  k-1 

where  e.  is  the  prediction  error  based  on  a  fixed-order  predictor  and  where 

2  2* 
a  -o,  . 
k 

2.3  The  Nonstationary  Problem 

The  above  detection  algorithm  depends  on  knowledge  of  stationary 
background  statistics.  The  significance  test  described  in  the  previous 
section  uses  a  stationary  Gaussian  assumption  where  the  covariance  matrix  K 
is  the  same  at  each  spatial  coordinate  (n,m).  In  reality,  however,  the 
background  statistics  of  an  image  are  changing  and  the  matrix  R  in  Equation  2 
is  generally  non-Toeplitz  and  differs  at  each  (n,m).  In  this  section  we  will 
investigate  the  significance  test  when  the  data  are  nonstationary.  We  shall 
see  that  the  significance  test  changes  little  from  what  we  found  previously. 

Let  us  consider  the  small  region  S  extracted  from  a  1-D  sequence  x(n) 
illustrated  in  Figure  2-6.  We  investigate  the  1-D  case  for  simplicity — 
the  2-D  case  follows  similarly.  The  time-dependence  of  the  statistics  of  the 
sequence  is  represented  through  the  time-varying  covariance  matrix  K(n). 

The  significance  test  associated  with  the  region  S  is  then  given  by 

p(x)  -  - m7o~ - rr>  exP  l"  4  (x  -  m(n))TR(n)(x  -  m(n))]  >  *  (21) 

“  (2v)m  |K(n)|  1/2  2  -  -  ~  - 

where  we  have  further  indicated  the  time-dependence  of  the  data  through  the 
time-varying  mean  vector  ra(n) . 
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Fig.  2-6.  One-dimensional  signal  with  small  region  S. 


Taking  the  logarithm  of  Equation  21,  we  have 

(x-m(n))^K  *(n)  (jx-m(n))  >  -Jln[  (2n)N|K,(n)  j  ]  -  2Hn  X  (22) 

We  want  to  maintain  a  constant  level  of  significance  by  appropriately  vary¬ 
ing  X.  We  can  show  that  a  constant  level  of  significance  can  be  maintained 
such  that  the  significance  test  becomes: 

(jt  -  m(n))TK  *(n)  (k  -  m(n))  >  constant  (23) 

Since  K(n)  is  symmetric  (but  note,  not  Toeplitz),  we  can  again  perform 

T 

an  LDL  decomposition,  which  now  becomes  a  function  of  the  time  variable  n: 

K(n)  -  L(n)D(n)LT(n)  (24) 
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The  significance  test  becomes 


(x-m(n))TL  ^(n)D  ^(n)(L  ^  (n)  )T(x-m(n) ) 


T 

3  e  (n)D 


(n)e(n)  >  constant 


(25a) 


where 


e(n)  “  L  *(n)(£-m(n))  (25b) 

As  before,  it  is  possible  to  show  that  the  transformation  of  Equation  25b 
corresponds  to  successive  orders  of  linear  prediction  where  the  diagonal 
elements  of  D(n)  are  the  prediction  error  variances.  However,  now  the 
prediction  coefficients  embedded  within  j?(n)  are  time-varying  as  well  as 
changing  with  the  growing  order.  This  implies  that  even  when  the  background 
can  be  modeled  by  a  nonstationary  autoregressive  process  of  finite  order,  the 
prediction  coefficients  associated  with  our  small  region  S  are  generally 
always  changing. 

Let  us  investigate  this  more  carefully.  For  example,  consider  a  Pth- 
order  time-varying  predictor,  with  prediction  error  of  the  form: 

P 

e(n)  ■  x(n)  -  l  a(n;k)x(n-k)  (26) 

k“  1 

Then  using  the  projection  theorem,  we  can  solve  for  the  a(n;k)'s  (at  each 

2 

time  sample),  which  minimite  E[e  (n) 1 : 

E(e(n)x(n-*))  -  0  l  “  1,2,...P  (27) 
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Substituting  Equation  26  into  27, 


P 

E[x(n)x(n-£)3  *  £  a(n;k)E[x(n-k)x(n-&) ]  (28) 

k-1 


or 


P 

r(n;k»0,£)  ■  £  a(n;k)r(n;k,£)  (29a) 

k*l 


where 


A 

r(n;k,f.)  «  E[x(n-k)x(n-i) ]  (29b) 

Note  that  r(n  :k,£)  represents  the  correlation  of  values  of  x(n)  for  n<n 
o  o 

since  k  and  £  are  constrained  to  be  non-negative.  Thus,  r(nQ;k,A)  reflects 
only  the  past  of  the  time-varying  correlation  structure  of  x(n). 

Noting  that  the  prediction  error  variance  is  given  as 

,  P 

o  (n)  ■  r(n;0,0)  -  £  a(n;k)r(n;k,0)  (30) 

k«  1 

we  can  combine  Equations  29  and  30  to  obtain  the  time-varying  normal 
equat ions: 


R(n)a(n) 


o2(n) 


(31a) 
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where 


• 

R(n)  -  . 

. .r(n;k,£ ) . . . 

* 

P  + 

- 

• 

— 

‘  2. 

r  1  1 

o  ( 

-a(n; 1) 

0 

-a(n; 2) 

0 

j»(n)  * 

• 

o  (n)  - 

0 

_ -a(n ; P) _ 

• 

-  0 

(31b) 


(31c) 


Refer  back  to  the  small  region  S  that  is  illustrated  in  Figure  2-6  for 
the  l-D  case.  We  want  to  show  that  £(n)  in  F.quation  25  does  in  fact 
correspond  to  the  prediction  error  from  successively  growing  time-varying 
predictors.  We  first  write  the  sequence  of  normal  equations  associated  with 
predicting  each  element  of  S  (we  assume  our  region  S  runs  over  the  interval 
0Cn<N): 
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R(O)aCO)  -  a2(0) 
R( l)a(l)  -  o2(l) 


R(N)a(N)  -  a2(N)  (32a) 

where  "A"  refers  to  a  growing  predictor,  and  the  index  n  refers  not  only  to 
the  location  of  the  sample  being  predicted,  but  also  to  the  order  of  the 
predictor.  However,  if  the  process  of  concern  follows  a  Pth-order 
autoregressive  model,  a,(n)  for  n*P  will  be  of  order  P;  i.e.,  a(n;u)**0  for 
k>P. 

Note  that  R(0) ,  R(l)...  are  matrices  growing  in  size,  but  that  R(n)  is  a 
lower  right-hand  corner  submatrix  of  R(N*1) — in  spite  of  the  fact  that  theae 
matrices  are  non-Toeplitz.  Consequently,  we  can  express  the  sequence  of 
normal  Equations  32a  as 


where  si(n)  represents  the  reversal  on  a(n). 

Finally,  note  that  (when  x(n)  is  zero-mean)  R(n)  ■  K(n)  in  Equation  32b, 
so  that 

a(0) 

L  *  -  ^(1)  0  (33a) 

a(N)  . 


18 


and 


a2(0) 

o2(l)  0 


D  = 


(33b) 


LO 


a2(N)J 


Thus,  e(n)  does  indeed  correspond  to  successive  orders  of  time-varying 
(growing)  predictors,  provided  that  the  process  being  predicted  has  zero 
mean. 

In  the  stationary  case,  we  can  approximate  e(n)  by  time-varying  finite- 
order  predictors.  As  before,  when  the  background  process  follows  an 
autoregressive  process  of  this  finite  order,  only  values  of  ^(n)  near  the 
boundary  of  the  region  S  will  be  inexact.  Our  approximate  significance  test 
based  on  Equation  25  becomes: 

£^(n)D  *(n)£(n)  *  e^(n)D  (n)£(n) 

N 

-  I  e2(k)/o2(k)  (34) 

k-l 

>  constant 


where  e  (k)  and  o  (k)  correspond  to  fixed-order,  but  time-varying  predictors 
(note  that  the  index  k  represents  our  time  index  over  the  region  S). 

Our  approximation  here  may  also  have  certain  computational  advantages. 
For  example,  we  do  not  need  to  recompute  predictor  errors  (corresponding  to 
different  orders)  for  overlapping  regions  S. 
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2.4  Adaptive  Estimation  and  Prediction 


The  significance  test  of  the  previous  section  depends  on  knowledge  of 
the  background  statistics.  We  need  to  know  or  estimate  the  coefficients  of 
the  assumed  time  space-varying  causal  autoregressive  model.  However,  in 
attempting  this  estimation,  we  encounter  the  infamous  uncertainty  principle. 
That  is,  to  obtain  a  reliable  estimate  of  R(n)  in  Equation  31  (which  is 
needed  to  estimate  a(n,k)),  we  require  stationarity  over  a  "sufficiently 
large"  window  size.  On  the  other  hand,  we  as3ume  statistics  are  generally 
changing  everywhere  in  space. 

To  side-step  this  problem,  we  assume  that  the  data  are  stationary  over 
the  estimation  window,  we(n,m).  The  location  of  the  2-D  estimation  window 
that  slides  over  our  data  will  be  designated  by  the  center  index  (n^m^)  as 
illustrated  in  Figure  2-7.  The  model  parameters  associated  with  this  window 
are  defined  as  (n^m^):  a(n^  .m^;  j  ,k)  . 


Pig.  2-7. 


m 

Representation  of  estimation  window. 
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The  estimation  procedure  we  shall  use  is  the  covariance  method  of  linear 
prediction.  The  prediction  error  associated  with  the  estimation  window  at 
spatial  coordinates  (n^,m^)  is  the  error  eCn^.m^)  in  predicting  the  value 
x(n  ,«  )  from  its  causal  neighbors.  Finally,  the  prediction  error  variance 
is  given  by  the  average  squared  prediction  error  under  the 
estimation  window  at  (n^.m^). 

For  each  pixel  location  (n^ra^  we  wish  to  estimate  the  set  of  model 

parameters  a^Q.m^jk,  £)  that  vary  in  space.  To  do  this  we  assume  that  x(n,m) 

is  a  Gaussian  2-D  random  field  stationary  over  each  w  (n-n„,m-m_),  and 

e  0  0 

follows  the  model  in  Equation  17  under  each  estimation  window.  Therefore, 
for  this  section,  we  drop  the  space  dependence  and  work  with  the  model  given 
by  Equation  17. 

We  shall  assume  that  the  prediction  coefficients  a(j,k)  fall  within  a 
(PxQ)  f i'-st-quadrant  plane  mask.  For  simplicity,  we  limit  our  derivations  to 
this  class  of  prediction  masks,  although  it  is  clearly  applicable  to  more 
general  mask  shapes,  such  as  the  nonsymmetric  half-plane  mask.  The  objective 
is  to  estimate  from  x(n,m)  the  model  parameters  a(j,k)  for  j  *  0,1.,., P  and 
k  “  0,1...,Q,  with  j  »  k  *  0.  Further,  assume  we  have  x(n,m)  for 
(n,m)  et-P+Oj tU^lxl-Q+m^ ,m^l  (see  Figure  2-8).  We  then  define  the  error 
e(n,ra)  over  the  region  I,  given  by  I  •  [n  ,n  )xlm.  ,m  ) ,  as 

1  4.  •  a. 

P  Q 

e(n,m)  ■  x(n,m)  £  l  s( j ,k)x(n-j ,m-k)  (n,m)el  (35) 

j-0  k*0 

( j,k)*(O,0) 

Our  goal  becomes  to  minimise  the  sum  of  the  squared  errors  given  by 

n  m 

2  2  9 

Eln0.°0)  "I  l  «  (36) 

n°nj  cpcij 
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The  approach  we  take  is  to  transform  the  ? -D  problem  to  a  1-D  problem  so 
that  a  1-D  least  squares  solution  is  applicable.  Note,  however,  that  we  will 
still  have  solved  the  2-D  least  squares  problem.  We  wish  to  transform 
Equation  36  into  a  1-D  error  expires-.: "'on.  To  accomplish  this  transformation, 
we  define  the  vectors  aln^m^]  and  o  uy 


a[n0,m0] 


”a(0,l)“ 

i 

sCn^ ,m^) 

a(0 ,2) 

sCn^ ,m^+l) 

a(0,Q) 

sCnj,  ,ra2) 

a(l ,0) 

sU^+l  .n^) 

a(i ,1) 

8(n^+l ,m^+l) 

• 

a 

>Q-1)  0  « 

• 

• 

a(l,Q) 

sin^l  ,m2) 

• 

a(P,0) 

s(n2 .m^) 

a(P,l) 

s(n^,m  +1) 

2  1 

1 

Co 

• 

2d 

i _ 

1 

eU2  .m^) 

* 


(PQ—  1 ) 


* 


(37) 


23 


(PQ-D- 


M-i 


M 


(33a) 


whore 


A;= 


PQ-1 


[s(n, )  .  .8(n^+j-0,m^+Q) ]  . . .  ( s(n . +j-P,m^-0) . . s(n^+j-P,ra^-Q) ] 

[8(n|+j-0,io  .  +  1- 1) .  .s(n^+j-0  ,m^+l+Q)  ] . .  .  [  8(n^+j-P ,ra^+l-0)  . .  s(n^+j-P,m^+i  -Q)  ] 


Mtij+j-O.n^-l)  . . 3<n ^+3-0 ,m2~Q)  ]  . . .  [  a(n ^j-P.m^O) .  .sU^-P.m  -Q)  ] 


(38b) 


and  where  we  have  assumed  the  data  segment  I  to  be  of  extent  M  x  M.  Note  that 
a  is  a  veetor  consisting  of  the  concatenation  of  the  rows  of  x(n,m)  over  S; 
a(n  ,«  )  is  a  vector  consisting  of  the  concatenation  of  the  rows  of  a(j,k)  for 
( j ,k)e[0,p]x(0 ,Q]  with  (j,k)  *  (0,0);  and  S  is  s  matrix  that  consists  of  the 
concatenation  of  rows  of  various  subsequences  of  the  known  x(n,m)  required  in 
predicting  each  value  of  x(n,ra)  over  I. 

Therefore,  we  can  write  Equation  36  as: 


n„  m 
2  2 


Kfn^.m^]  ■  l  I  e  (n,m) 


n*n j  to*fUj 


•  (Sa(n0,m0)  -  o)  (Sa(n0,to0]  -  o) 


(39) 


We  then  write  the  solution  [1]  to  minimizing  Equation  39  with  respect  to 
a[n0,m0]  as: 

-1  T 

aln^.m^]  ■  R  So  (40a) 


where 


T 

R  *  S  S  (40b) 

Note  that  the  matrix  R  is  of  extent  (PQ-l)x(PQ-l)  and  difficulty  in  its 
inversion  is  dependent  on  the  model  order,  not  on  the  size  of  the  known  block 
of  data. 

Since  R  is  generally  not  Toeplitz,  its  inversion  will  require  on  the 
3  -1 

order  of  (PQ)  operations.  The  computation  of  R  can  probably  be  reduced  by 
considering  its  block-like  structure  resulting  from  the  conversion  of  a  2-D 

problem  to  a  1-D  problem.  Thus,  assuming  P,Q<<N,  the  bulk  of  the  computation 

T  .  4 

is  embedded  within  forming  R  •  S  S,  which  requires  on  the  order  of  N 

operations. 

This  estimation  is  then  carried  out  at  each  pixel.  An  alternative  to 
this  direct  estimation  is  to  accomplish  the  estimation  recursively.  However, 
this  may  be  a  realistic  alternative  only  when  the  estimation  window  size  is 
less  than  the  model  order  [1};  i.e.,  the  matrix  required  to  be  inverted  at 
each  pixel  is  on  the  order  of  MxM.  We  are  currently  investigating  methods  to 
reduce  this  computation. 

In  either  case,  we  obtain  a  parameter  set  at  each  pixel  which  represents 
an  estimate  of  the  model  parameters  of  the  changing  background,  required  in 
our  prediction  procedure.  Finally,  it  is  straightforward  to  show  from 
Equations  39  and  40  that  the  estimate  of  the  prediction  error  variance  given 
by  the  average  squared  prediction  error  under  each  estimation  window  can  be 
expressed  by 

~2  T  T  T  -2 

0  W  “  0  -alvV  s)atno,mO^N  i^l) 
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2.5  The  Detection  Algorithm 


We  can  now  merge  the  results  of  the  previous  sections  to  form  our  target 

detection  algorithm.  From  our  coefficient  estimates  in  Equation  40,  we  can 

compute  the  prediction  error  function  e  (n,m)  based  on  a  fixed-order,  but 

~2 

time-varying  prediction  model.  Then  with  a  (n^,!^)  in  Equation  41,  we  can 
compute  the  2-D  version  of  the  approximate  significance  test  of  Equation  34: 

I  I  e^(k,i)/o^(k,i)  >  constant  (42) 

MeS 

where  we  can  think  of  the  indices  k  and  l  as  running  over  different 
regions  S. 

Equivalently,  we  can  consider  generating  the  statistics  in  Equation  42 

at  each  spatial  location  (n,m)  of  an  image  by  convolving  an  N  x  N  smoothing 

window  w  (n,m)  with  the  normalised  prediction  error  to  create  a  new  smoothed 
s 

function  E  (n,m) : 

3 


E  (n,m)  «*  a(n,m)**w  (n,m) 
s  s 


where 


q(n,m) 


~2  ~2 
e  (n,m)/o  (n,m) 


(43a) 


(43b) 


In  Che  estimation  of  the  model  parameters,  the  estimation  window 
w^(n,ra)  should  be  small  enough  to  preserve  approximate  stationary,  but  large 
enough  to  obtain  a  reliable  estimate  of  the  required  correlation 
coefficients.  The  estimation  window  must  also  be  large  enough  so  that 
anomalies  (i.e.,  targets)  do  not  badly  corrupt  the  correlation  estimates. 
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The  smoothing  window  should  be  small  enough  so  that  small-extent  targets 

are  not  overwhelmed  by  background  in  the  significance  test.  However,  it 

should  also  be  large  enough  so  boundary  effects  in  our  finite-order  model 

assumption  do  not  play  a  significant  role. 

The  overall  detection  algorithm  based  on  the  approximate  significance 

test  is  illustrated  in  Figure  2-9.  The  first  operation  subtracts  an  estimate 

of  the  local  mean  of  x(n,m)  (recall  that  our  significance  test  requires  a 

zero-mean  random  field),  which  is  computed  by  averaging  x(n,m)  under 

W  (n,m).  Under  the  estimation  window,  a  local  covariance  matrix  R(n,m)  as 
e 

defined  in  Equation  40b  is  computed.  R(n,m)  is  then  used  to  find  a(n,m)  and 

o^Cn.m),  which  are  required  to  compute  the  normalized  prediction  error, 

q(n,m).  Finally,  q(n,m)  is  convolved  with  the  smoothing  window  w  (n,m)  and 

8 

compared  to  a  threshold. 

2.6  Examples 

In  this  section,  we  present  nine  examples  based  on  the  detection 

algorithm  developed  in  the  previous  sections.  Throughout  this  section,  the 

estimation  window  w^  (n,m)  is  of  size  10*10  pixels,  which  we  assume  is 

sufficiently  larger  than  the  size  of  most  targets.  This  assumption  can  be 

justified  through  our  empirical  observation  that,  in  most  cases,  the 

CCF  [14]  of  our  processed  images  is  relatively  flat;  i.e.,  the  targets' 

presence  appears  not  to  adversely  affect  estimation  of  background  statistics. 

We  also  assume  that  a  10*10  window  is  large  enough  to  obtain  a  good  estimate 

2 

of  the  correlation  coefficients  required  in  estimating  a[n,ra]  and  a  (n,ra), 
but  also  small  enough  to  maintain  approximate  stat ionar ity .  Of  course,  this 

assumption  breaks  down  at  region  boundaries. 

In  the  first  examples,  we  consider  computer-generated  1-D  and  2-D 
signals  determined  by  exciting  all-pole  filters  with  white  noise.  We  then 
analyze  progressively  more  complicated  real  images  that  were  obtained  from 
the  RADC  data  base. 
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2-9.  Detection  algorithm. 


Example  1 


Consider  a  sequence  x(n)  of  the  form: 

x(n)  ■  0.95  x(n  -  1)  +  w(n)  (44) 

where  w(n)  is  zero-mean  white  noise.  A  sample  function  of  x(n)  is  shown  in 

Figure  2-10(a),  and  a  1-point  "object"  at  n  ■  64  is  shewn  in  Figure  2-10(b). 

The  single  coefficient  estimate  was  baaed  on  a  16-poinc  estimation  window. 

~2 

Figure  2-10(c)  shows  the  squared  prediction  error  e^.  The  object  is  clearly 
detected. 

Consider  a  second  sequence  depicted  in  Figure  2-ll(a)  of  the  form  in 
Equation  44  created  with  a  different  white-noise  input.  A  four-point  object 
has  been  implanted  at  locations  n  *  90,  91,  92,  and  93.  As  before,  the 
single  coefficient  estimate  was  based  on  a  16-point  estimation  window.  The 
squared  prediction  error,  illustrated  in  Figure  2-ll(b),  gives  a  clear 
indication  of  the  object. 

Example  2 

Figure  2-12(a)  depicts  a  1-D  slice  of  an  aerial  photograph  with  a  one- 
point  object  implanted  at  n  ■  64.  In  particular,  this  1-D  slice  was 
extracted  from  a  section  of  the  aerial  photograph  which  consisted  of  a  grove 
of  trees.  In  this  example,  a  two-parameter  noncausal  model  was  assumed: 

x(n)  »  a^  x(n-l)  +  a^  x(n  i)  +  w(n)  (45) 

where  w(n)  is  white  noise.  The  squared  prediction  error,  shown  in 
Figure  2-12(b),  clearly  picks  out  the  object.  A  second  object  and  its 
corresponding  squared  prediction  error  are  shown  in  Figures  2-l3(a)  and  (b), 
respective iy. 
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Fig.  2-11.  Detection  of  4-point  object  in  Example  1.  (a)  1-D  random 

sequence  with  object,  (b)  squared  prediction  error. 
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Fig.  2-12.  Detection  of  1-point  object  in  Example  2.  (a)  slice 

of  trees  with  object,  (b)  squared  prediction  error. 
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Fig.  2-13.  Detection  of  4-point  object  in  Example  2.  (a)  slice 

of  trees  with  object,  (b)  squared  prediction  error. 


Example  3 


Consider  a  2-D  sequence  generated  by  the  particular  2-D  difference 
equation  of  the  form: 

x(n,m)  ■  a(0,l)  x(n-0,m-l)  +  a(l,0)  x(n-l,m-0) 

+  a(l,l)  x(n-l,m-l)  +  w(n,m)  (46) 

The  background  sequence  (64x64  pixels  in  size)  was  generated  with 
coefficients  a(0,l)  ■  0.1,  a(l,0)  a  -0.9  and  a(l,l)  *  0.1.  Four  objects  were 
implanted  within  the  image,  all  of  a  constant  level,  but  with  a  variance 
about  equal  to  that  of  the  background.  Moreover,  the  size  and  level  of  the 
anomalies  were  chosen  to  be  visually  difficult  to  detect  from  the  background 
(see  Figure  2-14(a)).  The  model  assumed  in  the  estimation  procedure  is  given 
by  the  generating  process  (Equation  46). 

The  3-D  perspective  and  contour  map  of  the  squared  prediction  error 

e  (n,ra)  are  given  in  Figures  2-15(a)  and  (b).  All  four  objects  are  clearly 

detected,  and  even  the  two  closely  spaced  objects  are  resolved.  This  same 

function,  along  with  the  smoothed  e(n,ro)  (a  3*3  smoothing  window,  w  (n,m), 

8 

was  applied  in  this  example),  are  illustrated  in  Figures  2-14(b)  and  (c) 
after  thresholding.  Figure  2-14(d)  shows  the  prediction  error  variance,  and 
Figures  2-14(e)  and  (f)  show  the  smoothed  normalized  prediction  error--both 
appropriately  threshoided. 

Note  that  two  different  thresholds  are  applied  to  the  smoothed 
normalized  prediction  error.  The  first  resolves  three  of  the  four  objects, 
the  second  resolves  all  four  objects,  but  introduces  false  alarms.  This  is 
due  to  the  inaccuracies  of  the  estimate  of  the  prediction  error  variance, 
which  is  illustrated  in  Figure  2—1 4( d ) .  Ideally,  since  the  background  is 
stationary,  the  estimated  prediction  error  variance  should  be  flat.  However, 
as  seen  in  Figure  2-14(d),  the  estimate  actually  peaks  in  the  region  of 
objects — contrary  to  what  we  would  hope  to  happen.  We  have  encountered  in 
this  synthetic  example,  perhaps,  what  is  a  fundamental  limitation  in 
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measuring  the  background  prediction  error  variance:  the  presence  of  objects 
can  falsely  increase  the  background  residual  variance.  With  a  priori 
knowledge  that  the  background  prediction  error  variance  is  constant,  we  were 
able  to  improve  detection. 

Example  4 

Figure  2— 16(a)  depicts  a  64x64-pixel  RADC  image  in  which  two  2x2  pixel 
synthetic  objects  (of  constant  level)  have  been  implanted.  This  image  was 
created  by  a  64-to-l  downsampling  and  smoothing  of  the  original  image.  The 
assumed  background  model  is  the  same  three-parameter  model  used  in  the 
previous  example  in  Equation  46.  Figure  2— 16(b)  shows  the  prediction  error; 
Figure  2-16(c),  the  prediction  error  variance;  and  Figures  2-16(d)  and  (e), 
the  smoothed  normalized  prediction  error  (a  4*4  smoother,  w  (n,a),  was 
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applied).  The  processed  part  of  the  image  is  given  within  the  boxed  area. 
Note  that  normalization  of  the  prediction  error  in  this  case  (unlike  the 
previous  example)  has  helped  bring  out  the  object  from  the  more  busy  field 
background. 

Example  5 

Figure  2-17(a)  depicts  a  64x64  pixel  RADC  image  in  which  two  3x3  pixel 
synthetic  object*  (of  constant  level)  hsve  been  implanted.  This  field-tree 
image  was  created  by  a  64-to-l  downsampling  and  smoothing  of  the  original 
image,  In  our  first  attempt  to  detect  the  two  synthetic  objects,  the  three- 
parameter  model  of  Equation  46  was  assumed.  Although  the  object  in  field 
background  was  easily  detected,  the  object  in  tree  background  was  not 
detected,  even  with  normalisation  by  the  prediction  error  variance. 

Consequently,  in  our  second  attempt  at  detection,  we  assumed  a  twelve- 
parameter  nonsyometric  half-plane  autoregressive  model  (11}.  This  model  ia 
more  general  and  thus  more  likely  to  accurately  model  the  background  (11). 
Figure  2— i 7(b)  show*  the  prediction  error;  Figure  2-l7(c),  the  prediction 
error  variance;  and  Figures  2— 17(d)  and  (e),  the  smoothed  normalized 
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Fig.  2-17.  Detection  of  objects  in  RADC  image  for  Example  S. 


(a)  image  with  two  synthetic  objects,  (b)  prediction  error, 
(c)  prediction  error  variance,  (d)  smoothed  normalised 


prediction  error  (high  threshold),  (e)  smoothed  normalised 


prediction  error  (low  threshold). 


ii  (*uimno  mxtD 


prediction  error  (a  4*4  w  (n,m)  was  applied).  Because  of  the  computational 
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intensity  with  a  twelve-parameter  model,  only  the  designated  region  was 
processed.  Note  that  normalization  of  the  prediction  error  has  helped 
significantly  in  bringing  out  from  the  background  the  object  embedded  within 
the  trees. 

Example  6 


The  RADC  image  displayed  in  Figure  2-18(a)  consists  of  128x128  pixels 
and  was  created  by  a  16-to-l  downsampling  and  smoothing  of  the  original 
image.  The  assumed  background  model  is  the  same  three-parameter  model  used 
in  Example  3  in  Equation  46.  Figure  2-18(b)  shows  the  smoothed  normalized 
prediction  error;  Figure  2-18(c),  the  prediction  error;  and  Figure  2—1 8(d), 
the  smoothed  prediction  error — suitably  thresholded.  As  in  our  synthetic 
example,  the  smoothed  prediction  error  without  normalization  yields  fewer 
detected  objects  (which  may  or  may  not  be  considered  false  alarms)  than  the 
smoothed  normalized  prediction  error.  This  happens  probably  because  the 
background  variance  appears  reasonably  constant  throughout  the  image.  The 
objects,  however,  can  potentially  introduce  a  false  increase  in  the  local 
variance,  as  illustrated  in  Figure  2-18(e),  which  shows  a  thresholded  version 
of  the  prediction  error  variance. 

Example  7 

The  RADC  image  displayed  in  Figure  2-19(a)  consists  of  128x128  pixels 
and  was  created  by  a  16-to-l  downsampling  and  smoothing  of  the  original 
image.  As  in  the  previous  example,  a  three-parameter  autoregressive  model  is 
assumed.  Figures  2— 1 9C b )  to  (e)  make  the  same  comparisons  among  the  various 
residuals  as  made  in  Example  6. 

Example  8 

The  RADC  image  displayed  in  Figure  2-20(a)  consists  of  128x128  pixels 
and  was  created  by  a  64-to-l  downsampling  and  smoothing  of  the  original 
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Fif>.  2-18.  Comparison  of  smoothed  predict  io.  error  and  smoothed 
normalized  prediction  error  for  Example  6.  (a)  RADC  im.iRe, 

(b)  smoothed  normalized  prediction  error,  (c)  prediction  error, 
(d)  smoothed  prediction  error,  (e)  prediction  error  variance. 
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Fig.  2-20.  Comparison  of  prediction  error  and  smoothed  normalized 
prediction  error  with  first-quadrant  mask  for  Example  8.  (a)  RADC 

image,  (b)  prediction  error  variance,  (c)  prediction  error,  (d)  smoothed 
normalized  prediction  error. 
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image.  In  this  example,  we  consider  a  first-quadrant  causal,  second-quadrant 
causal  autoregressive  model,  and  average  of  the  two.  This  average  represents 
an  attempt  to  eliminate  the  directionality  of  the  approximate  significance 
test. 

Figures  2-20  and  2-21  illustrate  the  results  with  first-quadrant  (three- 
parameter)  and  second-quadrant  (three-parameter)  prediction  masks, 
respectively.  Figure  2-22  summarizes  our  results  by  depicting  the  smoothed 
normalized  prediction  errors  and  their  average.  Note  that  the  individual 
smoothed  normalized  prediction  errors  do  well  in  detecting  most  objects, 
while  the  average  appears  to  deteriorate  the  performance. 

Two  additional  experiments  that  were  performed  with  this  dv.ta  are  shown 
in  Figures  2-23  and  2-24.  Figure  2-23  shows  a  different  thresholded  version 
of  the  CCF  [14]  corresponding  to  the  prediction  of  Figure  2-20.  Th^  CCF 
bears  little  resemblance  to  our  prediction  errors.  Moreover,  due  to  the 
large  estimation  window  (i.e.,  10x10  pixels),  this  function  is  small 
everywhere — reflecting  little  sample-to-sample  change  in  the  coefficient 
estimates.  Finally,  in  Figure  2-24,  we  depict  the  smoothed  noncausal 
normalized  prediction  error.  The  noncausal  prediction  t  ask  is  an  eight-point 
nearest  neighbor  mask.  The  results  on  this  image  and  others  (e.g,,  Example 
5)  are  encouraging,  but  appear  to  do  no  better  (and  perhaps  worse)  than  the 
causal  prediction  masks. 

Example  9 


Consider  the  64x64  pixel  RADC  image  in  Figure  2-25,  generated  by  down¬ 
sampling  the  original  image  by  16-to-i  with  smoothing.  This  image  is 
particularly  interesting  because  of  the  presence  of  a  radio  tower  in  the 
lower  right-hand  corner  of  the  image.  Note  that  the  top  of  the  radio  tower 
has  been  clearly  detected. 

It  is  interesting  to  observe  i.hot  in  Examples  7,  8,  and  9  normalization 
of  the  prediction  errors  helped  detection  arJ  reduced  false  alarms  by 
reducing  the  background  va- iance  in  busy  regiono  such  as  the  tree  and  brush 
areas. 
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Fig.  2-21.  Comparison  of  prediction  error  and  smoothed  normalized 
prediction  error  with  second-quadrant  mask  for  Example  8.  (a)  RADC 

image,  (b)  prediction  error  variance,  (c)  prediction  error, 

(d)  smoothed  normalized  prediction  error. 
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Fir.  2-22.  Average  of  smoothed  normalized  prediction  errors 
for  Example  8.  (a)  RADC  image,  (b)  smoothed  normalized 

prediction  error  (1st  quad),  (c )  smoothed  normalized 
prediction  error  (2nd  quad),  (d)  average  of  (b)  and  (c). 
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Fig.  2-23.  CCF  corresponding  to  Figure  2-20. 


Fig.  2-24.  Smoothed  normalized  noncausal  prediction  error  for  Example  8. 
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Kip,.  2-25.  Comparison  of  prediction  error  ami  smoothed  norma  1  i  zed 
prediction  error  for  Example  9.  (a)  RADC  imape,  (1>)  prediction 

error  variance,  (c)  prediction  error,  (d)  smoothed  normalized 
predict  ion  error . 
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3.  ADAPTIVE  CONTRAST  ENHANCEMENT  PROGRAM 


We  have  developed  two  algorithms  for  adaptive  contrast  enhancement  of 
images  degraded  by  cloud  and/or  shadow.  The  first  algorithm  is  a  simple  gain 
in  local  contrast  [7],  and  the  second  is  more  complex  and  sophisticated — 
the  adaptive  homomorphic  algorithm  [8) .  A  computer  program  system  that 
implements  these  techniques  was  written  in  'C'  language  under  the  UNIX 
operating  system.  The  adaptive  contrast  enhancement  system  offers  more 
operation  modes  than  just  the  two  techniques  specified  above.  The  following 
sections  describe  in  detail  all  the  options  that  the  system  offers  to  the 
user.  The  system  consists  of  two  programs:  image_enhanc .c  and  scale. c; 
these  are  described.  In  addition,  an  actual  example  of  how  to  use  the 
program  is  given. 

The  system  described  was  installed  in  RADC's  AFES  system  and  a  version 
of  it  exists  on  our  computer  (VAX  with  UNIX  as  an  operating  system). 

3.1  Description  of  image_enhauc .c 

Program  image^enhanc .c  implements  the  adaptive  contrast  enhancement. 

The  program  offers  three  operation  modes: 

1.  intensity  domain 

2.  density  domain  (log)  on  a  positive  image 

3.  density  domain  (log)  on  a  negative  image 

After  selecting  one  of  the  three  modes,  the  user  can  choose  to  process  the 
resulting  array  ffn^nj)  in  two  different  ways: 

1.  point-by-point  (done  in  the  space  domain) 

2.  block  processing  with  50  percent  overlap  (done  in  the 
frequency  domain) 
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3.1.1  Point-by-Point  Processing 
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As  shown  in  Figure  3-1,  the  point-by-point  processing  separates  the 
input  array  into  two  components:  the  local  mean  (f.  )  and  the  local  contrast 

<V- 

The  local  mean  is  computed  as  the  weighted  average  of  the  input  array 
over  a  small  2-D  window  (W),  i.e.. 


-  }).r  : 

I  th  '  ‘  o', 

If  4  ‘  ■ 


♦hwaize  n.+hwsize 

1  ^  ^ 

fL(nl,n2)  "  U  ’  I  y 

ra^n^-hwsize  m2*n2-hwsize 


•  W(m1-n1  ,m2-n2) 


(45a) 


+  *H<n V  h2) 


<H(n,.  n2) 
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+  0<n,.  n2) 


where 


hwsize  hwsize 

U  -  l  7  W(L,j)  (45b) 

i“-hwsize  ja-hwsize 

2 

The  size  of  the  filter  W  is  (2  hwsize+1)  .  W  can  be  a  rectangular  window  or 
a  Gaussian  window. 

The  local  contrast  is  obtained  by  subtracting  the  local  mean  (f  )  from 

L 

the  value  of  the  point  (n^.n^),  i.e., 

fR^nl,rV  *  f^ni»n2^  “  fL(°l,n2)  U6) 

After  separating  the  signal  into  its  two  components,  the  enhancement  ia 
achieved  by  modifying  each  component  according  to  a  function  (which  is 
defined  by  the  user)  of  the  local  mean  of  the  intensity  values  —  DCin^.n^), 
i.e. , 

n^hwsize  i^+hwsize 

DC(uj ,0^)  ■  -  y  l  s(i,j)  W(i-n^ , j-n^)  (47) 

i",Oj-hwaize  j-n^-hwsize 

where  s(i,j)  is  the  value  of  the  point  (i,j)  in  the  input  image  (intensity 
domain) . 

The  local  contrast  is  multiplied  by  a  gain  k  (.),  which  is  a  function  of 
DC(n^,n^).  If  k(OC(n ^ tn^) 1>1 ,  the  contrast  vs  increased;  if  it  ia  less 
than  1,  the  contrast  is  decreased.  The  function  k  is  defined  as  piecewise 
linear.  The  new  contrast  f„(nt,n,)  ia  defined  as 

H  1  2 

^nl*n2^  "  VVV  (48) 

The  local  mean  goes  through  a  nonlinearity;  more  specifically,  the  new  moan 
f '  (n.  ,n  )  is  defined  as 

Is  1  2 

f^(nrn2)  *  (f^inj  ,»2)-midrsng)  •  tt DC<n  j  »n2>  1  *  midr*n8  (49) 


6 


where  raidrang  is  the  value  of  the  midpoint  in  the  range  of  values  in  the 
input  array  (in  intensity  processing  it  is  128  and  in  density  processing  it 
is  420.51og  (255+0)).  The  l  function  ranges  between  zero  and  one  and  is 
defined  as  piecewise  linear.  Figure  3-2  shows  the  way  the  new  local  mean  is 
generated. 


Fig.  3-2.  Local  mean  processing. 

We  would  like  to  change  the  local  mean  so  it  will  be  closer  to  the 

raidrang.  This  is  done  for  two  reasons.  The  first  is  to  compensate  for  the 

degradation  (clouds  +  high  DC,  shadows  +  low  DC).  The  second  is  to  account 

for  the  increase  in  contrast  that  is  achieved.  The  two  new  components  if' 

L 

f^)  are  combined  to  produce  the  processed  array  gCn^^).  The  different 
parameters  are  described  in  the  next  section. 

Parameters 


The  user  has  cor.,  col  over  eight  parameters  in  the  point-by-point 
Drocessing.  The  parameters  are  described  below,  and  actual  values  are  given 
in  Table  3-1. 
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TABLE  3-1 

PARAMETERS  FOR  POINT-BY-POINT  PROCESSING 


Parameter 

— 

Defaults* 

Range  of 
Normal  Values 

wtype 

g 

g  or  r 

hwsize 

5 

<15 

var 

6. 

<2  hwsize+1 

numpt 

6 

<21 

k(0j , . . . ,k[numpt-l] 

2.,  A. ,  5.,  7.,  8.,  8. 

12.>k[ i]> 1 . 

A  [0]  ... .  ,Mnurapt-l  j 

C.2.  ,  0.2,  0.2,  0.2,  0.2,  0.2 

l.>4[i]>0. 

mn(0] , . . *mn[nu«pt-l) 

0.,  100.,  150.,  175.,  255., 

255. 

xmax>mn[ i ] >0 

xnutx 

255. 

♦Defaults  exist  only  for  intensity  processing  of  a  cloudy  image. 


1. 


2. 


wtype  -  Shape  of  window  to  be  used  in  computing  the  local  mean, 
g  -  Gaussian  window 
r  -  rectangular  window 

var  -  Thickness  of  Gaussian  window  if  chosen  (see  Figure  3-3) . 


|m39»  *1 


Fig.  3-3.  Thickness  of  Gaussian  window. 


3. 


4. 


5. 

6. 
7. 


hvsize  -  (2  hwsize+l)  is  the  extent  of  that  window  (max  31  *  31). 

numpt  -  number  of  points  in  the  piecewise  linear  functions  (k,i) 

(maximum  21  points). 

k  -  the  array  that  contains  the  values  of  the  k  function  (max 

size  21). 

4  -  the  array  that  cootains  the  values  of  l  (max  size  21). 

mn  -  the  array  that  contains  the  corresponding  mean  values  of 

the  k  and  i  functions,  i.e.. 

Remark 

oo(0)  is  set  to  0  (for  the  beginning  of  the  range) 

mn(numpt-l)  13  set  to  the  maximum  gray  level  allowed  (usually  255.) 
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8.  xmax  -  maximum  gray  level  in  the  input  image  (usually  255.) 
(see  Figure  3-4). 


Fig.  3-4.  The  k  function  described  as  a  piecewise  linear  function 
of  the  mean  value  ran. 

3.1.2  Block-by-Block  Processing 

The  block-by-block  processing  (Figure  3-5)  uses  a  2-D  triangular  window, 
,  with  an  overlap  of  SOX  to  window  the  data.  Along  one  dimension,  this  looks 
like  the  sketch  in  Figure  3-6. 

A  high-pass  filter  is  applied  to  each  windowed  section.  The  exact  shape 

of  the  high-pass  filter  is  determined  according  to  the  value  DC^,  which  is 

the  average  value  (using  a  rectangular  window)  of  the  current  input  image 

(intensity  domain)  section.  The  exact  shape  of  the  filter  is  defined  by 

three  parameters  (A,  B,  and  C  shown  in  Figure  3-7)  which  are  given  for  DC  *  0 

and  DC  *  max.  allowed.  For  an  intermediate  value  of  DC.,  each  of  the  three 

v 

parameters  is  determined  by  the  formula: 
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X. 

1 


(XDC,x  -  W2  DCi2  , 
2 

(max  DC  al Lowed) 


^C-O 


for  X  *  A,  B,  or  C 


(50) 


The  filter  is  a  Gaussian-shaped  filter  as  shown  in  Figure  3-7.  The  following 
section  describes  in  detail  the  various  parameters  of  the  processing.  The 
parameters  are  listed  in  Table  3-2. 


TABLE  3-2 

PARAMETERS  FOR  BLOCK  PROCESSING 

Parameter 

Default* 

Range  of  normal  values 

hwaize 

8 

4,8,16 

xraax 

255. 

hfdcO 

1.1 

hfdcmax 

1.5 

£fdcO 

0.53 

£  fdcmax 

1.2 

vardcO 

5. 

vardcmax 

20. 

♦Defaults  are  designed  for  negative  density 
processing  of  a  cloudy  image. 

Parameters 


1.  hwsize  -  (2  hwaize)  is  the  size  of  the  processed  section  and  the 
filter  that  is  applied  (up  to  32  *  32  for  512  *  512  images 
and  up  to  16  *  16  for  1024  *  1024  images).  It  should  be  a 
power  of  two. 
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2. 

xmax 

-  Max  gray 

level 

255). 

3. 

hfdcO 

-  Value 

of 

A  (in 

4. 

hfdcmax 

-  Value 

of 

A 

for 

5. 

l  fdcO 

-  Value 

of 

C 

(in 

6. 

Adcmax 

-  Value 

of 

C 

for 

7. 

vardcO 

-  Value 

of 

B 

(in 

8. 

vardcmax 

-  Value 

of 

B 

for 

that  exists  in  the  current  image  (usually 

Figure  3-7)  for  mean  level  ■  0  (the  lowest) . 
mean  level  =  xmax. 

Figure  3-7)  for  mean  level  ■  0. 
mean  level  *  xmax. 

Figure  3-7)  for  mean  level  *  0. 
mean  level  ■  xmax. 


3.1.3  Resulting  Output 


After  the  point-by-point  or  block  processing,  the  resulting  output  array 
is  exponentiated  if  it  is  a  density  image.  Inversion  of  negative  mode  images 
can  be  done  in  the  program  scale. c. 


3.2  Description  of  scale. c 


For  some  of  the  processing  modes  allowed  in  image_enhanc ,c  (such  as  all 
density  processing  and  intensity  block  processing),  the  resulting  output 
array  does  not  lie  in  the  required  range  for  display  (0-255).  Thus,  the 
output  array  must  be  scaled.  Since  the  AFES  autojfndw.h  software  permits 
only  one  pass  over  the  data  and  scaling  must  be  done  in  two  passes,  this  led 
to  the  need  for  the  scale. c  program  to  be  run  after  image_enhanc.c  for  the 
specified  modes  of  operation. 

The  program  scales  the  output  array  of  loage_enhanc .c  according  to  a 
number  of  parameters  that  are  provided  by  image  enhanc.c  in  a  file  (see 
running  instructions).  The  parameters  are: 

1.  minv  -  minimum  value  in  the  array  to  be  scaled. 

2.  maxv  -  maximum  value  in  the  array  to  be  scaled, 

3.  pnt  -  point  or  block  processing. 

4.  den  -  density  or  intensity  mode. 

5.  nega  -  negative  or  positive  mode. 
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According  to  these  values,  the  array  is  scaled  in  the  following  manner: 

1.  For  block  processing  of  intensity  mode  or  for  any  processing  of 
positive  density  mode,  scale  the  input  array  such  that 

rainv - >  0 

maxv - >  255 

2,  For  any  processing  of  negative  density  mode,  scale  such  that 

minv - >  255 

maxv - >  0 


3.3  Running  Instructions 

The  following  sections  describe  the  interaction  between  the  user  and 
each  one  of  the  programs. 

3.3.1  image_enhanc.  (the  load  module  of  image_enhanc.c) 

User:  image_enhanc  <imagel ,data>  < image 2 .data> 

where 

imagel.data  is  the  name  of  the  input  image. 
image2.data  is  the  name  of  the  output  image. 

Terminal:  do  you  want  to  process  in  INTENSITY  or  DENSITY  domain? 

intensity  -  i,  density  -  d 
User:  i  (or  d) 

If  density  domain  { 

Terminal:  do  you  want  to  process  the  NEGATIVE  or  POSITIVE  image? 

User;  p  (or  n) 

) 
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Terminal:  do  you  want  a  POINT-by-POINT  processing  or  a  BLOCK 
processing? 
block  -  b,  point  -  p 
User:  b  (or  p) 

If  POINT  processing  was  selected: 

Terminal:  do  you  want  to  change  the  defaults?  The  defaults  exist  only 
for  intensity  processing  of  a  cloudy  image, 
yes  -  y,  no  -  n 
User:  n  (or  y) 

For  density  processing  { 

Terminal:  enter  filename  for  scaling  parameters  to  be  used  by  scale 

program 

User:  param.data 

} 

If  the  defaults  were  selected,  the  processing  starts  with  no  additional 
interaction  with  the  user  for  this  program.  If  the  defaults  were  not 
selected,  the  dialogue  continues. 


Terminal: 


User: 

Terminal: 

User: 

Terminal: 


enter  hwsize**half  size  of  window  to  compute  mean  value  -  n 
(2n+l)  (2n+l).  n  is  limited  up  to  15  for  a  512  *  512  image  and 
up  to  7  for  a  1024  x  1024  image, 
value  of  hwsize  (integer). 

maximum  gray  level  in  the  image  (usually  255)? 
value  of  xcoax  (float), 

the  following  inputs  define  the  k  and  i  functions  (as  functions  of 
the  mean  value),  k  -  multiplies  the  local  contrast  >“1  and  i-  -  is 
the  nonlinearity  that  is  applied  to  the  local  mean  and  is 
0,“<i<“l.  If  t*l,,  then  the  original  local  mean  is  kept.  If 
lm0.,  then  the  new  local  mean  is  exactly  the  midrange  of  levels 
allowed. 
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NUM  OF  POINTS  IN  FUNCTIONS  (up  to  21)? 


User: 

Te  rrainal : 
ser : 

Terminal: 

User: 

Terminal: 

User : 

Terminal: 

User: 

If  Gaussi 
Terminal: 
Uaer : 

Processing 

If  BLOCK 

Terrains  1 : 

User: 


first  point  must  be  for  mean  value»0.  and  last  must  be  for  mean 
value=xmax. 

value  of  numpt  (integer) 

contrast  amplif ierak[0] ,  nonlinearity  ■  &(0]  for  mean  value«0. 
values  of  the  contrast  amplifier  and  nonlinearity  (both  float)  for 
mean  value=0. 

mean  value=mn[  ],  contrast  araplif ier“k[  ], 
nonlinearity  =  H [  ]  for  j*...  j=l,... 

. . »numpt-2 

values  of  the  jth  mean  value  (float),  jth  contrast  amplifier 
(float)  and  the  jth  nonlinearity  (float). 

contrast  amplifier*k[numpt-l) ,  non 1 i near ity*f.  [numpt- 1]  for  mean 
value^xmax. 

values  of  contrast  amplifier  and  nonlinearity  (both  float)  for 
mean  value^xmax. 

Gaussian  window  -  g  or  rectangular  -  r  for  calculating  the  local 
mean  value? 
g  (or  r) 

i  window  was  selected  { 
thickness  (variance)  of  Gaussian  window? 
value  of  variance  (float). 

} 

starts  with  no  additional  interaction  with  the  user. 


roce8sing 


was  selected: 


do  you  want  to  change  the  defaults?  The  defaults  exist  only  for  a 
negative  density  processing  of  a  cloudy  image, 
yea  -  y,  no  -  n 
n  (or  y) 
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Terminal: 

User: 


enter  filename  for  scaling  parameters  to  be  used  by  scale  program 
parara.  data 


If  the  defaults  were  selected,  the  processing  starts  with  no  additional 
interaction  with  the  user  for  this  program.  If  the  defaults  were  not 
selected,  the  dialogue  continues. 


Terminal: 

User: 

Terminal: 


User: 

Terminal: 

User: 

Terminal: 

User: 

Terminal: 


User: 


maximum  gray  level  in  the  image  (usually  255.)? 
value  of  xmax  (float) 

mapping  variables  HFDCO,  HFDCMAX  -  the  values  of  the  high-pass 
filter  at  the  largest  frequency  for  mean  values  0.  and  xmax 
(maximum  gray  level), 
values  of  hfdcO  and  hfdcraax  (float). 

mapping  variables  LFDCO,  LFDCMAX  -  the  values  of  the  high-pass 
filter  at  Wa(0,0)  for  mean  values  0.  and  xmax. 
values  of  JlfdcO  and  ifdcmax  (float). 

mapping  variables  VARDCO,  VARDCMAX  -  the  thickness  of  the 
Gaussian  high-pass  filter  for  mean  values  0.  and  xmax. 
values  of  vardcO  and  vardcmax  (float). 

enter  hwaize^half  window  site  for  the  adaptive  filtering 
(8,16,32..)  -  a  number  that  is  divisible  by  the  row  and  column 
length  of  the  image,  hwsize  is  limited  up  to  16  for  a  512  *  512 
image  and  up  to  8  for  a  1024  *  1024  image, 
value  of  hwsiae  (integer). 

End  of  interaction. 


Specific  xampte: 


User: 
Terminal : 

User: 

Terminal: 

User: 


image  enhanc  cloud. data  procc loud  .data 

do  you  want  to  process  in  INTENSITY  OR  DENSITY  domain? 
intensity  -  i,  density  -  d 
i 

do  you  want  a  POINT-by-POINT  processing  or  BLOCK  processing? 
block  -  b,  point  -  p 
P 
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Terminal: 


User : 
Terminal : 


User: 
Terminal : 
User : 
Terminal : 


User : 
Terminal : 
User: 
Terminal : 


User: 
Terminal : 


User: 
Terminal : 


do  you  want  to  change  the  defaults?  The  defaults  exist  only  for 
intensity  processing  of  a  cloudy  image, 
yes  -  y,  no  -  n 

y 

enter  hwsize*half  size  of  window  to  compute  ;tean  value  -  n 
(2n+l)(2n+l).  n  is  limited  up  to  15  for  512  x  512  image  and  up  to 
7  for  a  1024  x  1024  image. 

5 

maximum  gray  level  in  the  image  (usually  255.)? 

255. 

the  following  inputs  define  the  k  anu  l  functions  (as  functions  of 
the  mean  value),  k  ■  multiplies  the  local  contrast  >"1.  and  l  - 
is  the  nonlinearity  that  is  applied  to  the  local  mean  and  is 
0."<4Ol.  ,  If  £“1.,  then  the  original  local  mean  is  kept.  If 
£^0.,  then  the  new  local  mean  ia  exactly  the  tnidrang  of  levels 
allowed. 

NUH  OF  POINTS  IN  FUNCTIONS  (up  to  21)? 

first  point  must  he  for  mean  value*0.  and  laat  must  be  for  mean 
value«xmax. 

3 

contrast  aapl i f ier«k(0] ,  nonlinearity»f(0]  for  mean  value»0. 

2.  0.7 

mean  value*on(  ),  contrast  aapl if ier«k{  J,  nool  meant v*  1  for 
i  *  1 

150.  4.  0.5 

contrast  ampl if  ier»k(nuapt-l  J ,  nonl inearity*Mnu»pt-l  1  for  mean 

value*xo**, 

6.  0.2 

Cauacisn  window  -  g  or  rectangular  -  r  for  calculating  the  local 
mean  value? 
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g 

thickness  (variance)  of  Gaussian  window? 

6. 

starts  with  no  additional  interaction  with  the  user. 

Scale  (the  load  module  of  scale. c) 
scale  < image2 .dat a>  <iraage3 .dat a>  <param.data> 

where : 

iraage2.data  is  the  name  of  the  ouput  file  of  image  enhanc. 
image3.data  is  the  name  of  the  resulting  processed  image, 
param.data  is  the  name  of  the  file  that  contains  information  for 
the  scale  program.  The  filename  was  given  as  a  parameter  in 
iraage_enhanc  program. 

3.4  Installation  instructions 

The  two  programs  were  dumped  on  tape  using: 
tar  c  image_enhanc .c  ecale.c 

To  read  these  files  from  tape,  use 

tar  x  image  enhanc. c  seate.c 

To  compile  and  load: 

image_enhane .c 

CC  or  ♦  add  options  for  using 

scale  .c 


stdi o.h 
math . h 
auto  wrtdw.h 


User: 
Terminal : 
User: 

Processing 

3.3.2 

User: 
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Special  running  instructions:  The  program  image  enhanc.c  requires  a  64K  byte 
memory  just.  Tor  data  arrays;  i.e.,  the  special  setting  of  64K  bytes  for  data 
and  64K  bytes  for  program  should  be  used. 

3.5  Glossary  of  the  Program  Variable  Names 

hfdcO  -  value  of  the  high-pass  filter  at  (*,*)  tor  zero  mean, 

hfdcmax  -  value  of  the  high-pass  filter  at  (it  ,x)  for  mean  value*xmax. 

hwsize  -  half  of  the  window  size, 

k  -  contrast  gain  function. 

I  -  nonlinearity  applied  to  the  mean  value. 

AfdcO  -  value  of  the  high-pass  filter  at  (0,0)  for  zero  mean. 

Afdcmax  -  value  of  the  high-pass  filter  at  (0,0)  for  mean  value=*xmax. 

mn  -  array  of  mean  values. 

numpt  -  number  of  points  in  k,£  and  mn  functions, 
var  -  variance  of  Gaussian  window. 

vardcO  -  thickness  of  Gaussian-shaped  high-pass  filter  for  zero  mean, 

vardcmax  -  thickness  of  Gaussian-shaped  high-pass  filter  for  mean  value-xmax. 

wtype  -  shape  of  window, 

xmax  -  maximum  value  in  the  image. 


4.  MULTI-PROCESSOR  ARCHITECTURES  FOR  IMAGE  PROCESSING 


During  the  last  half  of  FY82,  we  have  been  exploring  several  issues 
related  to  the  development  of  a  processor  architecture  suitable  for  image 
processing  problems.  Typical  image  processing  problems  involve  data  sets 
composed  of  several  hundred  thousand  pixels  (picture  elements)  and  large 
amounts  of  computation  (several  hundred  processor  instructions  per  pixel). 
Depending  upon  the  application,  the  computation  may  have  to  be  carried  out  in 
real-time  at  TV  frame  rates  (30  per  second)  or  fast  enough  to  permit 
comfortable  interaction  for  a  human  operator  (1  second  elapsed  time).  It  is 
clear  that  an  image  processing  architecture  must  be  capable  of  supporting 
rapid  computation  on  large  data  sets. 

In  imagj  processing,  as  opposed  to  other  multi-dimensional  signal 
processing  applications,  most  operations  tend  to  be  local;  that  is,  the 
processing  of  widely  separated  parts  of  the  image  is  independent.  Global 
operations,  such  as  the  2-D  Fourier  transform  of  an  entire  image  where  each 
output  value  depends  on  all  of  the  input  pixel  values,  are  rarely  used  in 
typical  itnpge  processing  operations.  For  this  reason,  it  is  possible  to 
consider  the  implementation  of  most  processing  operations  by  a  set  of 
processors  working  in  parallel  on  different  parts  of  the  image  with  a  minimal 
amount  of  communication.  To  be  truly  useful,  however,  such  a  multi-processor 
architecture  must  be  capable  of  handling  the  few  (but  important)  exceptions 
Chat  may  occur  in  any  particular  application. 

A  very  high-level  diagram  of  the  multi-processor  architecture  we  plan  to 
pursue  is  shown  in  Figure  4-1.  It  consists  of  16  nodal  processors  connected 
by  a  communications  network.  The  number  of  processors  is  somewhat  arbitrary 
from  an  architectural  point-of-view  but  was  sized  based  on  preliminary 
computational  requirements  and  was  chosen  to  be  a  power  of  two  in  order  to 
permit  the  use  of  a  butterfly  cowmunicat ion  network.  The  system  input/output 
(I/O'  port  is  shown  as  being  part  of  the  coraraunicst ion  network,  but  it  may  be 
necessary  to  alter  this  to  provide  the  required  I/O  bandwidth.  A  thorough 
study  of  the  system  I/O  remains  to  be  done. 
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Pig.  4-1.  A  high-level  block  diagram  £or  a  multi-processor  system 


8i 


a 


There  are  several  possibilities  for  the  communications  network.  The 
most  obvious  choice  is  a  high-speed  data  bus.  Typical  busses  used  in 
conventional  computers  have  bandwidths  on  the  order  of  1-2M  bytes/second,  but 
this  is  not  fast  enough  to  satisfy  the  bandwidth  requirements  of  a 
16-processor  system.  It  is  conceivable  that  an  advanced  bus  based  on  RF 
coaxial  cable  or  optical  fibers  could  be  developed  to  satisfy  the  bandwidth 
needs  which  we  estimate  to  be  roughly  100-200M  bytes/second. 

In  FY82  we  have  begun  investigating  the  butterfly  network  structure 
shown  in  Figure  4-2  for  use  ns  the  interprocessor  communication  network.  The 
network  is  shown  unfolded  so  that  processor  outputs  are  located  on  the  left 
and  processor  inputs  are  located  on  the  right.  This  network  allows 
communication  between  any  two  nodal  processors  and  permits  many  sets  of 
conversations  to  take  place  in  parallel.  In  another  project,  a  four- 
processor  system  was  constructed  using  Motorola  MC68000  microprocessors  and  a 
similar  butterfly  network. 

Several  important  system-level  questions  remain  to  be  addressed.  We 
have  already  mentioned  the  question  of  Bystem-level  I/O  to  an  outside  host 
computer  or  data  source.  The  significant  issue  of  how  to  make  the 
programming  (and  controlling)  of  a  multi-processor  system  appear 
straightforward  also  needs  to  be  examined  in  detail.  A  multi-processor 
system  that  is  fast  but  difficult  to  program  will  not  be  very  useful  to  the 
image  processing  community. 

In  Che  following  sections  several  architectural  features  for  a  nodal 
processor  will  be  discussed.  These  features  are  worth  developing  and 
refining  because  they  show  promise  of  helping  achieve  the  dual  goals  of  high 
computational  throughput  and  ease-of-progranming.  The  architectural  study 
is,  however,  far  from  complete,  and  subsequent  ideas  and  developments  may 
alter  or  eliminate  some  of  the  architectural  features  discussed  below. 
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Fig.  4-2,  A  16-proceasor  butterfly  network.  Each  circle  represents  a 
2-input,  2-output  switch  that  can  pass  signals  straight  or  crossed. 
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4.1  Ground  Rules  for  Developing  a  Nodal  Processor  Architecture 


Most  of  our  work  in  this  area  during  the  last  quarter  of  FY82  has  been 
directed  toward  the  architecture  of  a  nodal  processor.  As  a  ground  rule,  we 
have  tried  to  separate  architectural  issues  from  implementational  issues. 

Some  architectural  principles  may  ultimately  have  to  be  compromised  in  the 
interest  of  efficient  and  timely  hardware  implementation.  However,  our 
objective  thus  far  has  been  to  develop  the  best  architecture  possible  without 
any  preconceptions  about  implementation  details.  Furthermore,  the  technology 
to  support  the  underlying  implementation  of  a  particular  architecture  is 
constantly  evolving.  Many  computer  manufacturers  use  the  same  architecture 
(for  software  compatibility)  but  constantly  upgrade  the  implementation  to 
deliver  systems  with  improved  per forraance/cost  characteristics.  With  the 
advent  of  VLSI  and  VHSIC  technology,  implementation  techniques  will 
undoubtedly  change  once  again.  The  development  of  an  image  processing 
architecture  may  influence  future  chip  sets,  thus  permitting  the 
implementation  of  architectural  features  that  may  have  to  be  compromised  with 
today's  technology.  Therefore,  we  have  decided  to  concentrate  initially  on 
developing  the  right  architecture  without  implementational  constraints. 

Processors  can  be  compared  along  many  dimensions:  speed,  efficient  use 
of  memory,  ease  of  programming,  instruction  set  sophistication,  word  length, 
size,  weight,  power  consumption,  reliability,  etc.  For  now,  we  have  taken 
the  attitude  that  speed  and  ease  of  programming  are  moat  important,  that  the 
instruction  set  is  important  as  an  influence  on  the  speed  (minimizing  the 
number  of  instructions  required  for  some  computation)  and  ease  of 
programming,  and  that  "memory  is  cheap."  It  is  permissible  to  use  memory 
inefficiently  if  it  permits  faster  execution  or  simplifies  programming. 

The  easiest  way  (up  to  a  point)  to  build  a  fast  processor  is  to  use  the 
fastest  available  logic  family  for  the  implementat ion.  For  a  fixed 
architecture,  the  basic  machine  cycle  time  will  determine  the  speed  of 
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execution.  There  are  mitigating  concerns,  of  course.  Fast  logic,  in 
general,  implies  more  power  consumption,  more  demanding  board  layout,  lower- 
level  integration,  and  less  reliability.  It  will  probably  be  necessary  to 
compromise  between  raw  logic  speed  and  other  measures  such  as  integration 
level  or  power  consumption. 

4.2  Architectural  Requirements  for  a  Nodal  Processor 

Based  on  the  premise  that  the  nodal  processors  in  a  multi-processor 
image  processing  system  will  have  to  be  fast  computers  in  their  own  right 
that  are  capable  of  handling  large  arrays,  we  can  begin  to  outline  some  of 
the  architectural  requirements  for  the  nodal  processors.  Traditional  high¬ 
speed  array  processor  architectures  have  been  very  "horizontal,"  using 
techniques  such  as  separate  program  and  data  memories;  separate  hardware  for 
address  computation  and  index  register  manipulation;  and  pipelined  fetch, 
decode,  and  execution  of  instructions.  The  nodal  processor  architectures 
that  we  are  now  studying  will  use  many  of  the  same  techniques,  provided  that 
they  do  not  impair  the  ease  with  which  the  machine  can  be  programmed. 

Since  the  nodal  processors  will  be  handling  large  arrays  of  image  data, 
array  access  must  be  very  efficient.  Similarly,  it  is  important  to  provide  a 
mechanism  for  controlling  program  loops  that  accrues  very  little  overhead. 

Many  processor  architectures  are  capable  of  dynamically  allocating  data 
memory  upon  each  invocation  of  a  subroutine,  procedure,  or  function.  This 
capability  permits  re-entrant  subroutines,  efficient  interrupt  handling,  and 
shared  instruction  code  (although  this  is  probably  not  important  for  this 
application).  In  addition,  dynamic  memory  allocation  should  result  in  a  more 
efficient  use  of  the  data  memory  than  static  allocation  would,  since  it 
permits  the  "time-sharing"  of  memory.  Unless  we  uncover  a  sound  argument 
against  it,  dynamically  allocated  memory  will  be  an  architectural  requirement 
for  the  nodal  processors. 
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To  permit  simple,  straightforward  programing,  the  nodal  processor 
architecture  should  efficiently  support  a  high-level  language  such  as  "C". 

The  machine  language  itself  should  be  relatively  high  level  so  that  assembly 
language  programing  (machine  mnemonics  plus  macro-instruction  capability)  is 
conducted  at  a  high  level.  It  is  important  to  isolate  the  programmer  as  much 
as  possible  from  the  particular  implementation  of  the  nodal  processor.  The 
programmer  should  be  concerned  with  specifying  operations  and  data  objects  to 
be  used  as  operands  and  not  with  the  details  of  address  computation  or  index 
register  manipulation. 

The  machine  language  instruction  set  should  be  flexible  so  that  it  can 
be  "tuned"  for  different  applications.  In  particular,  it  should  provide  some 
simple  array  handling  instructions  such  as  clearing  an  array  or  adding  two 
arrays.  A  set  of  a  half-dozen  or  so  such  array  instructions  could  relieve  a 
programmer  of  much  pedestrian  code  generation  in  an  image  processing 
application  and  allow  him  to  concentrate  on  the  important  parts  of  the 
program.  Machine  language  flexibility  could  be  achieved  by  using  micro-code 
to  realize  each  machine-level  instruction.  New  instructions  could  be 
accommodated  by  writing  new  micro-code,  assuming  that  there  are  enough  unused 
op-codes  remaining  in  the  instruction  format.  The  fetching,  decoding,  and 
execution  of  the  micro-instructions  could  be  pipelined  to  increase  the 
effective  speed  of  the  nodal  processor. 

The  nodal  processor  architecture  should  be  flexible  enough  to  include 
special-purpose  computational  devices  that  may  differ  depending  on  the 
application.  For  example,  a  high-speed  multiplier  or  multiplier/accumulator 
could  be  considered  a  "special-purpose"  device,  although  for  image  processing 
and  multi-dimensional  signal  processing  applications  it  is  a  requirement. 
Other  computational  devices  might  include  an  FFT  butterfly,  a  16-point  FFT, 
or  a  CORDIC  rotation  element.  A  special-purpose  device  may  also  be  necessary 
to  facilitate  I/O  with  the  outside  world  as  well  as  the  interconnection 
network.  The  important  point  is  to  allow  sufficient  flexibility  to  permit 
the  nodal  processors  to  be  retrofitted  with  special-purpose  components  to 
help  increase  computational  throughput  for  a  particular  application. 
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The  issue  of  fixed-point  versus  floating-point  arithmetic  has  yet  to  be 
decided.  To  some  extent,  it  is  more  of  an  implementation  issue  than  an 
architectural  issue.  Fixed-point  arithmetic  tends  to  be  faster,  but  new 
single-chip  LSI  floating-point  components  may  close  the  gap.  Fixed-point 
accuracy  and  dynamic  range  can  be  improved  by  going  to  longer  word  lengths, 
with  the  concomitant  increase  in  hardware,  but  then  it  becomes  more  difficult 
to  use  the  single-chip  16-bit  multipliers  currently  on  the  market.  There  are 
advantages  and  disadvantages  to  both  data  representations,  and  the  ultimate 
solution  may  be  the  traditional  one  of  supporting  both.  However,  because  of 
the  horizontal  architecture,  it  may  be  feasible  to  use  fixed-point  arithmetic 
for  addressing,  counters,  loop  control,  and  indexing  and  use  floating-point 
arithmetic  strictly  for  data  computation.  This  would  enforce  the  natural 
separation  in  the  programmer's  mind  between  data  to  be  processed  and 
variables  used  for  data  access  and  program  control. 

4.3  Three-Address  Instructions 

In  many  cases,  execution  of  a  machine  instruction  will  take  two  source 
operands  and  perform  an  operation  on  them  to  produce  a  single  result  to  be 
stored  at  a  particular  destination.  Thus,  three  addresses  need  to  be 
specified  in  a  typical  instruction.  Some  computer  architectures  use  one  of 
the  source  addresses  aa  the  destination  address,  while  more  primitive 
architectures  use  an  accumulator  register  as  one  source  as  well  as  the 
destination.  However,  programs  written  for  these  architectures  usually 
require  a  significant  number  of  LOAD,  STORE,  or  MOVE  instructions  that 
simply  transfer  data  without  doing  any  computation.  The  three-address 
architecture  should  alleviate  some  of  this  overhead,  resulting  in  an 
inherently  faster  nodal  processor.  The  use  of  three-address  instructions  is 
also  consistent  with  the  philosophy  of  a  horizontal  architecture,  permitting 
the  addresses  of  the  souvces  and  destination  to  be  computed  in  parallel. 

There  are  occasions,  however,  when  the  destination  address  is  identical 
to  one  of  the  source  addresses.  Sooe  code  compaction  will  result  if  two- 
address  instructions  are  included  in  the  instruction  set.  On  other 
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occasions,  the  result  of  an  operation  needs  to  be  stored  only  temporarily 
because  it  will  be  an  operand  for  the  next  instruction.  Thus,  it  may  be 
prudent  to  provide  a  stack  of  temporary  storage  registers  to  be  used  as 
easily  accessed  operand  sources  and  destinations.  (Actually,  a  "temporary 
variables"  stack  is  an  implement  at ional  rather  than  an  architectural  issue. 
The  three-address  instructions  do  not  need  a  stack  to  be  useful, 
particularly  if  access  to  data  memory  can  be  made  as  fast  as  access  to  the 
stack.) 

4.4  Array  Instructions 

For  image  processing  as  well  as  other  array  processing  applications,  it 
seems  prudent  to  provide  the  nodal  processor  architecture  with  a  small  but 
powerful  set  of  instructions  for  performing  array  operations.  These 
operations  would  include  element-by-element  addition,  subtraction, 
multiplication,  and  division  of  two  arrays,  and  other  operations  such  as 
element-by-eleraent  maxirauraa  and  rainimums  may  also  be  worth  including.  Other 
candidates  are  inner  products,  sum-of-e  lenient  s,  sum-of-e  lenient  a-squared, 
absolute  value,  and  2-D  convolution. 

These  array  instructions  would  be  implemented  in  micro-code  to  utilize 
the  maximum  speed  advantage  of  the  architecture.  It  should  also  be  possible 
to  implement  other  array  instructions  in  micro-code  for  particular 
applicat ions. 

4.5  Data  Addressing 

An  underlying  assumption  of  the  nodal  processor  architecture  is  that  it 
must  aupport  a  fairly  large  data  memory  (256K  bytes-lH  byte)  for  image 
processing  applications.  For  direct  addressing,  this  implies  addresses  at 
least  20  bits  in  length,  and  perhaps  24  bits  or  even  32  bits  for  future 
expansion  of  the  data  memory.  With  three-address  instructions,  the  number  of 
bits  needed  for  specifying  source  and  destination  locations  thus  ranges  from 
60  to  %  bits,  resulting  in  very  wide  instruction  words.  To  keep  the 
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instruction  width  down,  some  computer  architectures  make  use  of  address 
registers.  These  registers,  which  contain  the  addresses  of  the  desired 
operands,  are  few  enough  so  that  they  can  be  specified  with  a  small  number  of 
bits  in  an  instruction  word.  Operands  are  thus  accessed  indirectly. 

However,  address  registers  must  be  saved  and  restored  during  context  switches 
like  subroutine  calls  and  returns  and  interrupt  servicing,  and  this  may  imply 
a  high  level  of  overhead. 

Because  it  fits  in  nicely  with  the  notion  of  dynamic  memory  allocation, 
we  have  been  investigating  a  stack-oriented  data  memory.  Data  in  the  stack 
can  be  accessed  relative  to  the  stack  pointer  (which  points  to  the  top  of  the 
stack),  a  frame  or  environment  pointer  (which  points  to  a  memory  location 
determined  by  the  current  program  context;  see  Section  4.9),  or  a  global 
pointer  (which  for  simplicity  may  be  taken  as  the  bottom  of  memory).  It  may 
also  be  useful  to  provide  other  pointers  into  the  stack  to  facilitate  data 
access.  This  question  is  still  open. 

With  this  structure,  single  data  items  (as  opposed  to  arrays  of  data 
items)  will  be  generally  accessed  by  specifying  in  the  instruction  word  a 
pointer  register  and  a  constant  offset  value  to  be  added  to  the  contents  of 
the  pointer  register  to  compute  the  effective  address. 

4.6  Array  Data  Access 

In  a  typical  computer  architecture,  array  elements  are  accessed  by 
computing  an  effective  addreas,  which  is  the  sura  of  a  base  address  plus  an 
offset  or  index.  For  two-  or  higher-dimensional  arrays  defined  in  a  high- 
level  language,  it  is  necessary  to  explicitly  compute  the  effective  address 
by  repeated  multiplications  and  additions.  When  array  elements  are  accessed 
sequentially  (within  a  loop,  for  example),  much  of  this  addreas  computation 
can  be  eliminated  by  simply  incrementing  the  effective  address  by  the  proper 
amount . 

From  a  programmer ' a  point  of  view,  However,  the  computation  of  an 
effective  address  from  an  array  base  address  and  index  values  is  a  nuisance 
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to  be  handled  by  a  compiler  or  directly  by  the  hardware.  In  image  processing 
applications,  we  expect  that  a  great  deal  of  computation  will  use  two- 
dimensional  arrays  accessed  sequentially  within  loops.  Consequently.,  it  is 
important  that  the  architecture  handle  the  implied  address  computations 
rapidly. 

We  have  been  investigating  two  concepts  for  accessing  array  elements. 

The  simpler  consists  of  providing  the  necessary  arithmetic  capability  to 
compute  effective  addresses  by  adding  the  contents  of  two  address  registers 
while  simultaneously  incrementing  one  of  the  address  registers  by  an  amount 
contained  in  a  third  register.  This  allows  the  base  address  to  be  contained 
in  one  address  register  with  the  periodically  updated  offset  to  be  contained 
in  the  second  address  register.  This  scheme  is  relatively  straightforward, 
but  it  forces  the  machine  language  programmer  to  get  involved  in  keeping 
track  of  effective  addresses  rather  chan  simply  specifying  an  array  name  and 
the  values  of  its  indices. 

As  an  alternative,  we  have  been  exploring  the  concept  of  array  access 
registers  (AARs).  In  its  simplest  form,  an  AAR  contains  information  about  an 
array,  such  as  its  base  address  in  memory,  the  number  of  dimensions,  the 
aaxuuun  number  of  storage  cells  in  each  dimension,  and  the  number  of  bytes  of 
memory  used  to  contain  a  single  storage  cell,  which  will  allow  hardware  to 
convert  a  request  for  a  particular  element  of  a  particular  array  into  an 
effective  memory  address.  Going  one  step  further,  we  can  actually 
incorporate  index  registers,  as  well  as  the  effective  address  corresponding 
to  the  current  values  of  the  index  registers,  into  the  AARs.  This  permits  a 
separate,  noninterfering  set  of  index  register#  for  accessing  each  distinct 
array. 

The  index  registers  can  be  initialised  or  reset  by  machine- level 
instructions  specifying  che  AAR  and  the  new  values  of  the  indices.  In 
addition,  inatroctions  can  specify  that  an  array  index  be  incremented, 
decremented,  or  seroed  out  after  beinR  used.  Hardware  will  be  re#ponaible 
for  computing  a  new  effective  address  from  the  altered  index  (or  indices)  and 
storing  it  back  m  the  AAR  for  future  use.  This  mode  should  be  very 
efficient  for  loops  which  perform  the  same  basic  operation  on  all  element#  of 
an  array. 


The  AAR8  can  be  allocated  dynamically,  :ust  as  data  memory  is.  This 
will  permit  efficient  use  of  AARs  by  effectively  "time-sharing"  them  as 
different  subroutines  become  activo.-  AARs  will  also  permit  arrays  to  be 
passed  as  subroutine  arguments  * n  an  efficient  manner.  If  AARs  are  allocated 
by  a  slack,  however,  it  implies  that  array  data  are  accessed  by  two  levels  of 
indirection,  one  with  respect  to  a  stack  pointer  (or  frame  pointer)  to  get 
the  appropriate  AAR,  and  the  second  to  use  the  effective  address  provided  by 
the  AAR.  We  will  have  to  examine  the  .AAR  concept  carefully  to  see  if  the 
doubly  indirect  access  leads  to  an  unacceptably  slow  •'•rc'ni.tecture . 

In  theory,  scalar-valued  variables  could  also  bn  accessed  using  AARs. 
This  would  provide  a  consistent  architecture  in  that  scalar  and  array 
variables  would  be  accessed  in  a  similar  manner,  with  the  programmer  treating 
each  variable  as  a  data  object  with  certain  attributes.  However,  this  would 
mean  that  scalars  as  well  as  arrays  would  require  two  levels  of  indirection 
for  accessing.  It  does  not  seem  that  an  extra  level  of  indirection  for 
scalars  will  improve  the  access  efficiency  that  is  potentially  available  for 
arrays,  because  scalar  variables  do  not  have  indices  to  be  manipulated  or 
effective  addresses  to  be  computed.  Nevertheless,  the  architectural 
consistency  which  would  result  way  overrule  this  conclusion  upon  closer 
examinat ion. 

4.7  Program  Control 

Currently,  the  nodal  processor  architecture  contains  several  uuefui 
instructions  for  transferring  program  control.  The  instructions  consist  of 
BRANCH,  JUMP,  CAbb,  RETURN,  BREAK,  turd  NEXT.  The  instructions  may  bp  u&ed  in 
both  conditional  and  unconditional  formats. 

A  BRANCH  instruction  transfers  control  to  an  instruction  located  at  some 
address  relative  to  the  BRANCH  instruction;  that  vs,  it  i*  a  relative  jump. 
Conversely,  a  JUH?  instruction  transfers  control  to  an  absolute  oddrcas  in 
the  instruction  memory.  CAbb  transfers  control  to  an  absolute  address  a« 
well,  but  it  also  keeps  track  of  a  number  of  things  Idiacnssed  in  Sec¬ 
tion  4.9)  useful  for  transferring  control  to  a  subroutine.  RETURN  allows 


control  to  be  transferred  back  from  a  subr-.-ut  ine  using  the  absolute  address 
stored  by  CALL.  BREAK  and  NK'T  are  essentially  BRANCH  instructions  useful  in 
controlling  program  loops.  (They  will  be  discussed  in  that  context  in 
Section  4.8.) 

The  conditional  versions  of  these  control-transferring  instructions  make 
use  of  a  set  of  condition  codes.  The  condition  codes  consist  of  bits  that 
indicate  whether  a  result  from  an  operation  was  positive,  negative,  zero, 
caused  a  carry  or  an  overflow,  etc.  The  condition  codes  are  set  by  an 
operation  resulting  from  an  instruction  that  had  its  test  flag  set.  Codes 
from  subsequent  instructions  that  had  their  test  flags  set  are  ORed  with  the 
existing  condition  codes.  The  codes  are  reset  (cleared)  by  a  conditional 
control-transferring  instruction,  unless  that  instruction  indicates  (via  a 
flag)  chat  the  codes  are  not  to  be  altered.  Finally,  the  architecture  should 
support  instructions  that  can  store  the  condition  codes  in  data  memory  and 
can  restore  the  condition  codes  from  data  memory. 

a. 8  Loop  Control 

The  image  processing  applications  for  the  nodal  processor  will 
doubtlessly  lead  to  programs  containing  many,  relatively  short,  nested 
loops.  Consequently,  it  is  important  for  the  architecture  to  support  program 
loops  in  a  very  efficient  manner.  These  loops  will  have  the  characteristic 
that  the  number  of  times  they  eseruted  is  independent  of  Cite  data  being 
processed.  Thus,  in  «  typical  architecture,  a  register  is  initialised  to 
contain  the  number  of  times  chat  the  instructions  in  the  loop  will  be 
executed.  Then  the  register  is  decremented  at  the  bottom  of  the  loop  and 
tcated  to  determine  whether  to  branch  back  to  the  top  of  the  loop.  Every 
time  the  loop  is  executed,  the  decrement,  test,  and  conditional  branch, 
instructions  are  also  executed.  For  short  loops,  these  instructions  can 
represent  a  significant  overhead  compared  to  the  useful  computation  carried 
out  within  the  loop. 

For  most  of  the  loops  used  in  image  processing  software,  u  simple 
architectural  feature  can  be  used  to  support  the  overhead  for  loop  control. 
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Since  loops  are  perfectly  nested,  a  stack  of  loop  counters  can  be  usee  to 
keep  track  of  the  current  iteration.  When  a  loop  is  initialized  (by  a  single 
machine-language  instruction),  an  initial  value  for  the  loop  counter  is 
pushed  on  the  stack,  At  the  end  of  the  loop,  another  instruction  causes  the 
loop  counter  to  be  decremented  and,  if  it  is  3t i 1 1  positive,  control  to  be 
transferred  to  the  top  of  the  loop.  The  top-of-loop  address  can  be  set  up  at 
loop  initialization  in  a  stack  parallel  to  the  loop  counter  stack.  When  the 
loop  counter  finally  reaches  zero,  the  loop  is  exited  and  the  stacks  holding 
the  loop  counter  and  the  top-of-loop  address  are  popped. 

Under  certain  circumstances,  it  is  desirable  to  exit  prematurely  from 
one  iteration  of  the  loop  and  to  begin  the  next.  The  NEXT  instruction,  which 
may  be  conditionally  executed,  performs  this  function.  It  may  be  desirable 
to  exit  a  loop  entirely  before  all  the  iterations  have  been  completed.  The 
BREAK  instruction,  also  conditionally  executable,  can  be  used  for  this 
purpose . 

The  NEXT  and  BREAK  instructions  are  essentially  conditional  BRANCH 
instructions.  The  NEXT  instruction  branches  to  the  end-of-loop  instruction, 
which  decrements  the  loop  counter  and  tests  it  ae  usual.  The  BREAK 
instruction  branches  out  of  the  loop,  but  it  must  also  pop  the  loop-counter 
stack  and  the  top-of-loop  address  stack.  The  offsets  needed  for  calculating 
the  next  instruction  address  can  be  computed  by  a  compiler  or  assembler. 
Although  it  is  not  necessary,  there  may  be  some  speed  advantage  to  storing 
the  transfer  aduresses  for  BREAK  and  NEXT  instructions  on  stacks  parallel  to 
the  loop  counter  stack.  This  impleraentat ional  issue  will  have  to  be  examined 
in  more  detail. 

Loops  in  which  the  number  of  iterations  is  dat a-dependent  can  also  be 
implemented  with  the  specialized  loop  instructions.  An  infinite  loop  could 
be  set  up  by  setting  an  "infinity"  bit  in  the  loop  counter.  The  loop  would 
be  exited  by  using  a  conditional  BREAK  instruction. 

Structured  programming  texts  argue  for  testing  at  the  top  of  the  loop 
rather  than  the  bottom  as  a  defensive  programming  tactic.  It  may  be  prudent 
to  incorporate  other  specialized  loop  control  mechanisms  in  the  architectural 
requirements  to  support  both  top-of-loop  and  bot tom-of-loop  testing. 
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4.9  Subroutine  Linkage 


Scructured  software  tends  to  have  many  subroutines  and  consequently 
many  subroutine  calls.  Thus,  it  is  important  to  have  an  efficient  subroutine 
linkage  mechanism  so  that  the  overhead  for  short  subroutines  is  not  too 
large.  One  subroutine  linkage  mechanism  we  have  been  exploring  is  an 
elaboration  of  those  used  on  stack-oriented  computers  such  as  the  HP-3000. 

At  present,  it  actually  involves  seven  parallel  stacks. 

The  seven  stacks  are  called  the  instruction  address  stack  (l-stack),  the 
data  stacx  (D-stack),  the  array  access  register  stack  (A-stack),  the  data 
frame  stack  (DF-stack),  the  AAR  frame  stack  (AF-stack),  the  number-of-data- 
parameters  stack  (#DP-stack),  and  the  number-of-AAR-parameters  stack  (#AP- 
stack).  Figure  4-3  is  an  outline  of  how  these  stacks  are  manipulated  during 
subroutine  calls  and  returns. 

The  simplest  stack  to  understand  is  the  1-slack.  When  a  subroutine  is 
called,  control  is  passed  to  the  starting  address  of  the  subroutine  and  the 
return  address  is  pushed  onto  the  I-stack.  When  a  RETURN  instruction  is 
executed  in  the  subroutine,  the  return  address  is  popped  off  the  I-stack  and 
into  the  program  counter. 

The  D-stack,  the  DF-stack,  and  the  #DP-stack  are  closely  related.  At 
any  given  level  of  subroutine  nesting,  the  D-stack  contains  the  information 
needed  for  the  current  subroutine  context.  The  data  stack  pointer,  which 
itself  sits  atop  the  DF-stack,  points  to  the  top  of  the  D-stack  and  is 
updated  when  data  are  pushed  on  or  popped  off  the  D-st3ck.  One  of  the 
ad dr  ssing  modes  allows  data  to  be  accessed  with  a  negative  offset  relative 
to  the  data  stack  pointer.  Data  on  the  D-stack  may  also  be  accessed 
relative  to  the  data  frame  pointer,  which  itself  resides  just  below  the  data 
stack  on  the  DF-stack.  The  data  frame  pointer  points  to  the  top  of  the 
parameter  list  for  the  current  subroutine. 

To  call  a  subroutine,  space  is  first  allocated  on  the  D-stack  to  hold 
any  values  to  be  returned  by  the  subroutine.  This  is  accomplished  simply  by 
incrementing  the  data  stack  pointer.  Next,  the  parameters  specified  by  the 
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arguments  of  the  CALL  instruction  are  pushed  onto  the  D-stack  and  the  number 
of  parameters  is  pushed  onto  the  #DP-stack.  When  the  CALL  is  executed,  a 
duplicate  copy  of  the  data  stack  pointer  is  pushed  onto  the  DF-stack,  making 
the  new  data  frame  pointer  equal  to  the  old  data  stack  pointer.  After 
control  is  transferred  to  the  subroutine,  the  data  stack  pointer  is  again 
incremented  to  allocate  storage  for  dynamic  variables  that  are  local  to  the 
subroutine.  Additional  temporary  storage  may  be  allocated  simply  by  pushing 
values  onto  the  D-stack  and  updating  the  data  stack  pointer. 

When  a  subroutine  RETURN  is  executed,  the  DF-stack  is  popped  restoring 
the  old  data  frame  pointer  and  the  data  stack  pointer  to  the  state  that 
existed  after  the  subroutine  parameters  were  pushed  on  the  D-stack  but 
before  the  subroutine  was  actually  called.  The  number  of  data  parameters  is 
then  popped  off  the  top  of  the  #DP-stack  and  subtracted  from  the  data  stack 
pointer  to  effectively  de-allocate  the  memory  used  to  pass  parameters  to  the 
subroutine . 

The  other  three  stacks — the  A-stack,  the  AF-stack,  and  the  #AP-stack — 
are  handled  in  a  fashion  analogous  to  their  data  stack  counterparts.  Rather 
than  holding  data  values,  however,  the  A-stack  contains  the  names  of  AARs, 
which  in  turn  point  to  arrays  of  data  values. 

As  we  mentioned  earlier,  it  is  possible  to  use  the  AAR  concept  to  access 
scalars  as  well  as  arrays.  If  this  were  done,  the  resulting  architectural 
consistency  would  permit  the  reduction  from  seven  stacks  to  four  stacks.  The 
separate  D-stack,  DF-stack,  and  #DF-atack  would  not  be  needed. 

4.10  Input /Output 

There  are  two  aspects  of  nodal  processor  1/0:  communication  with  the 
outside  world  and  communication  with  other  nodal  processors  in  the  multi¬ 
processor  system.  Here  we  shall  limit  our  discussion  to  the  latter.  In 
terms  of  architectural  requirements,  we  want  the  nodal  processor  to  handle 
the  protocols  for  I/O  as  little  as  possible  for  two  primary  reasons.  First, 
we  do  not  want  to  slow  the  nodal  processor  with  bookkeeping  tasks  related 
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to  the  I/O  and  second,  we  want  the  flexibility  to  interface  improved 
interprocessor  connection  networks  easily  ss  they  are  developed. 

These  requirements  argue  for  a  direct  memory  access  (DMA)  capability 
with  a  separate  I/O  processor  to  handle  the  necessary  communications 
protocols.  The  nodal  processor  should  simply  be  able  to  request  that  data 
be  sent  to  another  processor  or  be  received  from  another  processor.  In 
addition,  it  should  be  possible  to  interrupt  the  nodal  processor  to  inform  it 
that  data  has  been  received  from  another  processor.  A  great  deal  remains  to 
be  done  in  the  specification  of  architectural  features  to  support  both 
interprocessor  I/O  as  well  as  system  I/O. 

4.11  Summary  of  the  Multi-Processor  Architecture 

The  architectural  specifications  for  a  multi-processor  system  are  far 
from  complete.  As  described  in  the  previous  sections,  we  have  begun  looking 
at  some  of  the  architectural  requirements  for  the  nodal  processors.  In  FY83 
we  plan  to  continue,  with  the  immediate  goal  of  specifying  a  nodal  processor 
instruction  set  and  architectural  features  to  permit  its  efficient 
implementation.  Major  areas  of  concern  include  I/O,  array  access,  loop 
control,  and  subroutine  linkage.  F.ase  of  programming,  flexibility  of  the 
instruction  set,  and  the  ability  to  incorporate  special-purpose 
computational  hardware  are  also  important. 

The  16  features  listed  below  contribute  to  our  goals  of  high 
computational  throughput  and  ease  of  programming.  Much  more  study  needs  to 
be  done  and  further  development  may  alter  the  desirability  of  these  features. 
Nevertheless,  at  this  stage,  these  are  important  specifications  of  a  system 
architecture  for  image  processing  applications. 

1.  Multi-processor  architecture  consisting  of  16  nodal  processors 
communicating  via  an  interprocessor  communications  network. 

2.  Butterfly-type  communications  network. 
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3.  Modular  interface  between  nodal  processors  and  communications 
network. 

4.  "Horizontal"  architecture  for  nodal  processor. 

5.  Dynamic  allocation  of  data  memory  using  data  stacks. 

6.  Thirty-two-bit-wide  data  words. 

7.  Flexible  instruction  set  implemented  in  micro-code. 

8.  Pipelining  of  instruction  fetch,  decode,  and  execution. 

9.  Support  for  special  purpose  computational  elements. 

10.  Three-address  instructions. 

11.  Array  handling  instructions. 

12.  Stack-oriented  data  access  with  separate  hardware  for  address 
computation. 

13.  AARs  for  rapid  access  of  array  elements. 

14.  Specialized  instructions  for  loop  control. 

15.  Stack-oriented  subroutine  linkage. 

16.  Separate  I/O  processor  to  support  DMA  communications. 
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APPENDIX 

DERIVATION  OF  CONSTANT  FALSE-ALARM  RATE  DETECTION 


In  this  appendix,  we  derive  the  general  functional  form  of  A  to  maintain 
a  constant  level  of  significance  in  Equation  3,  regardless  of  the  background 
statistics  (i.e.,  regardless  of  the  covariance  K).  Without  loss  of 
generality,  we  shall  assume  a  zero-mean  process.  Then  from  Equations  2,  3, 
and  15a,  we  have: 


a 


p(x)<A  (2 n) 


1 

W2, 


172 


exp 


T  -1 
x  K  x 


dx] 


f  1  ,  1  -1  ,T  -1.  -1 

/  - rrrx - rr,  exp  {-  T  (L  x)  D  (L  x)  ]  dx 

p(x)<\  (2i:)N/2lKj1/fc  2  _  ~ 


(A- 1 ) 


Now,  let 


c  -  L-ix  (A-2) 

#o  that,  since  L.  *  has  unit  diagonals,  using  the  method  of  Jacobians  (6) ,  we 
have, 


de  *  dx 


(A- 3) 


Thus,  substituting  Equations  A-2  and  A-3  into  A-l,  we  obtain 


t  1  ,  1  T  -1 

, - ian - T7T  exP  l"  7  «  D  «lde  “  a 

p(Le)<\  (2'-)N/2|k{  1/2  2  ' 


(A- A) 


101 


tmaoxLM  pack  BLAfcK-Nor  nuso 


-1/2 

Furthermore,  we  have  (since  D  is  diagonal): 


r  l  r  1  ,  -1/2  vT.  -1/2  ... 

J  - - rrr  exP  l- -r  (D  e)  (D  e)Jde 

p(Le)<X  (2*)N/i|Kj17Z  T  ~  ~ 


/  - UTJ - ny  exp  [-  -L  STe)  ]o|1,,2<i£  "  a  (A-5) 

pUD^XA 


where  we  have  used  the  substitutions: 


a  -1/2 
e  *  D  e 


(A-6a) 


,  A  ,„|-1/2  J 

de  ■  D  de 


Noting  that  | K |  =*  |l)  j  [3),  we  have  from  Equation  A-5, 


1  r  1  aTa,  ,A 


p(U>i/2e)<X  <2,) 


(*  A  •  A  i  .A 

-  ^  e  e  J  de  *  a 


(A-6b) 


(A-6c) 


Note  that  the  integrand  in  Equation  A-6  does  not  depend  on  the  statistics  of 
jc.  Furthermore,  the  boundary  of  our  transformed  critical  region  is,  from 
Equation  A-6,  given  by  the  equation: 


pfLD^e  )  «  X 
-o 


(A- 7) 


which,  from  Equations  2  and  15a,  can  be  expressed  as 


exp  [-  -y  (LD1/2e^)  L_I  D~ ‘l" l( LDI/2£o) ] 


(2i)N7^TI7T  tXP 


( A-8a) 
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or 


-  y  -  An[(2i)N/2|KiU2xl  (A-8b) 

Therefore,  to  maintain  a  constant  level  of  significance,  we  miat  have: 

tnl(2»)S/,2|K|  ^2X]  »  constant  (A-9a) 


or 


X 


as  we  had  proposed  earlier  in  Equation  12. 


(A-9b) 
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